Sqlite over ALBLogs

This is a short on my journey to do some simple log processing. My client had a backend issue with a particular request path, and I was able to identify the source of errors and the source of slowness pretty easily with this.


Identify S3 bucket and copy locally

First, you’ve got to identify where your S3 logs are. This is really easy from the console; Click the load balancer, check the attributes section, and see where it is.

Alternatively, from the cli:

# List load balancers
aws elbv2 describe-load-balancers
aws elbv2 describe-load-balancers  | jq -r '.LoadBalancers[] | (.LoadBalancerName + " " + .LoadBalancerArn)'
# Look for access_logs.s3.bucket and access_logs.s3.prefix
aws elbv2 describe-load-balancer-attributes --load-balancer-arn ARN
aws elbv2 describe-load-balancer-attributes --load-balancer-arn ARN | jq '.Attributes[] | select( (.Key == "access_logs.s3.bucket" or .Key == "access_logs.s3.prefix"))'

Once you’ve got the location, you can copy it down with

aws s3 sync s3://lb-logs/prefix/date/ scratch/alblogs

Unpack and Concat

Now you’ve got a big directory full of small logfiles. Unpack them and put them all in one big file for sqlite import:

zcat * > all.log

Create SQL

First, create the schema for your ALBLogs.

CREATE TABLE IF NOT EXISTS "alblogs"(
 "type" TEXT, "time" TEXT, "elb" TEXT, "client_port" TEXT,
 "target_port" TEXT, "request_processing_time" int, "target_processing_time" int, "response_processing_time" int,
 "elb_status_code" TEXT, "target_status_code" TEXT, "received_bytes" int, "sent_bytes" int,
 "request" TEXT, "user_agent" TEXT, "ssl_cipher" TEXT, "ssl_protocol" TEXT,
 "target_group_arn" TEXT, "trace_id" TEXT, "domain_name" TEXT, "chosen_cert_arn" TEXT,
 "matched_rule_priority" TEXT, "request_creation_time" TEXT, "actions_executed" TEXT, "redirect_url" TEXT,
 "error_reason" TEXT, "target_port_list" TEXT, "target_status_code_list" TEXT, "classification" TEXT,
 "classification_reason" TEXT, "conn_trace_id" TEXT,
 "transformed_host" TEXT, "transformed_uri" TEXT, "request_transform_status" TEXT
);

Then, import your logfile into a table:

.separator " "
.import "all.log" alblogs

For really useful work, you’ll likely need a regex implementation. Depending on your distro, you might already have one installed. I specifically installed sqlite3-pcre. You activate one like:

.load /usr/lib/libsqlite3-pcre.so

That’s it! You’re now ready to query your logs.

select request, target_processing_time from alblogs order by target_processing_time desc limit 10;

Leave a Reply