The system is designed to allow the proxies/drivers to generate a variety of records to detail the events that are occurring as traffic is processed. These can include:
- SQL statement execution
- Connections opening and closing
- JDBC Method invocation
- Console/Debug logs
As many of these categories can result in a very large amount of traffic, a variety of mechanisms are provided to control the volume of logs. Here is where each is controlled:
- Connection open and close: Only in the VDB tab, under logging
- JDBC Method invocation: In the VDB tab, under Debugging
- SQL Statement Execution: As an all on/off toggle under the VDB tab, under Logging, OR; via Log rules, with selective throttle controls
- Exceptions: Always generated
- Console Logs: These are always generated, but where they are sent can be controlled with the "Aggregate Console Logs" option under Logging in the VDB
- Debug Logs: These output as console logs, but control if they are generated is under the VDB Debugging tab with the "Verbose Debug Mode" option.
When using a log rule to perform SQL statement logging, there are a variety of options that can be used to limit the rate of logs. These are documented in the rules configuration documentation under "Rate Limiting" and can provide for a maximum log entries per second, or a ratio of logs processed vs. not processed. The rate limiting occurs at the proxy/driver level, and by reducing the rate of logging, this also reduces the amount of processing on the node generating the logs, so is highly recommended in production environments to limit the impact.
Global Log Management
In order to control the total size of the logs globally, there are two options available in the admin->Server properties section:
- Max Log Age: This is the number of days that log files and database log entries are allowed to persist. This defaults to 14 days.
- Reserved Disk Space: This sets how much space must be free at all times. During log rolling, SQL logs will be deleted until this value is achieved.
When cleaning up disk space, logs are deleted first based on the Max Log Age parameter, then based on the Reserved Disk Space value.
By default, logs are written to /opt/heimdall/log, however this can be configured. By adding into the javaOpts the property "heimdall.logdir", the relative directory can be controlled, i.e. "-Dheimdall.logdir=./log2/" will set the directory to use log2 for the flat file logs.
When analytics are to be used, we log data into an internal database on the management server, which summarizes the data in a concise format necessary for fast performance and minimal space and overhead. The minimum necessary to perform this logging is either:
- Enable the "Log All SQL" option in the VDB Logging section, or;
- Attach an enabled "Log" rule that matches the desired content to be logged.
In order to provide detailed logging, an additional option is available in the VDB tab, to "Write logs to file", which will trigger all query logs for the VDB to not only be registered to our internal database, but to also be logged individually in a CSV formatted file in the $serverhome/log directory, and follow the following format (CSV, and can be opened in a spreadsheet if desired):
- epoch-ms: Log timestamp (ms since epoch)
- type: Event type
- TRACE: Generated with trace actions
- DONE_SQL: Generated with log actions
- CONNECTION_OPEN: Generated if log connections set
- CONNECTION_CLOSE: Generated if log connections set
- METHOD_COMPLETE: Generated if log methods set
- TRANSACTION_START: Generated if log methods set
- TRANSACTION_END: Generated if log methods set
- DEBUG: Generated in rare error cases
- EXCEPTION: When a query or other exception occurs
- NEW_PATTERN: Triggered when observing a new query pattern
- NOTIFY: For rules tagged with a notify
- CONSOLELOG: When console log aggregation is enabled, this provides the log data
- flags: Status flag
- 0: Cache miss
- 1: Cache hit
- 2: Cache miss (in transaction)
- 3: Cache hit (in transaction)
- User: JDBC Username used to connect to the db
- vdb: VDB name
- hostname: Hostname of the source
- instance-id: A unique ID for each JVM, to provide uniqueness even on the same host
- conn-id: Session id (uniquely assigned from server to identify different users)
- class: Log entry source method type:
- C: connection
- S: statement
- R: resultSet
- P: prepared
- X: callable
- D: Driver level
- B: Data Source
- U: Unspecified type
- obj-id: Object id (semi-unique identifier for given object in this session)
- parent-id: Parent id (id of parent object. E.g. If ResultSet, parent id would be id of statement that created it)
- runtime-us: Runtime in microseconds i.e. response time of operation generally. For connection close events is total time connection was open, and for TRANSACTION_END, it is the time since the last TRANSACTION_END or the TRANSACTION_START.
- db-name: Data source (may be the name of the LB source)
- Message: Event text. For both the log (done_sql) and trace (trace) event types, all of the following fields will be inserted, but for the done_sql, many will be empty values, as the data is not tracked for this type to lower overhead. Other record types include text that represents what was done, i.e. the method, etc.
- rows: number of rows
- rows-read: number of rows not read
- result-size: Estimate of raw byte size of Result
- java-size: Estimate of Java byte size of Result (including object overhead)
- proctime-ms: Overall time (all rows processing time)
- result-hash: Pattern hash used by Heimdall, to associate similar queries
- text: text field based on the record type--this can change between releases fairly often.
SQL statement & Parameters The SQL statement is logged in the format of “Statement~num parameters~param1~param2~”. Each parameter is replaced with a ? in the statement. The target is to simplify the parsing of query parameters, yet insure that if log records are sorted based on this portion of the message, than unique queries will be sorted together. This format also makes stripping potentially sensitive information from the records when needed.
Note: The log fields may change between major revisions of code. Each file contains a header that provides the position of each field, and should be used for parsing of the file to insure that additional or rearranged fields do not break any code depending on the format of the log file.
Example Log Entries
#Fields: epoch-ms, type, flags, user, vdb, hostname, instance-id, conn-id, class, obj-id, parent-id, runtime-us, db-name, rows, rows-read, result-size, java-size, proctime-ms, user, result-hash, text
1523556866884, TRACE, 0, magento, Magento-Demo-vdb, ip-172-31-56-200, 94f7f042-7bd5-41a5-bb49-091877788fde, 0, R, 469576, 11812, 1616, Magento-Demo-source-Master, 1,1,65,425,1740,4BC71F67001332175D27AFF533095BB2,magento,magento.test,SHOW TABLE STATUS LIKE ?~1~\"'core_resource'\"~,
1523556866892, TRACE, 0, magento, Magento-Demo-vdb, ip-172-31-56-200, 94f7f042-7bd5-41a5-bb49-091877788fde, 0, R, 469577, 11812, 7420, Magento-Demo-source-Reader, 68,68,1925,6005,7584,EBF219E0CA457EC7C9A9255C33A25908,magento,magento.core_resource;magento.test,SELECT
1523556866894, TRACE, 0, magento, Magento-Demo-vdb, ip-172-31-56-200, 94f7f042-7bd5-41a5-bb49-091877788fde, 0, R, 469578, 11812, 803, Magento-Demo-source-Reader, 9,9,177,357,862,1AF8150173C7C1B330EE8434132A65FB,magento,magento.test;magento.index_process,SELECT