Microsoft has released a Kusto Query Language(KQL) challenge for us to solve. It consists of five different challenges.
After completing two cases we’re on fire! Lets just say, the next one is tricky… Challenge three –
“Bank robbery”
In short, this is what clues we get –
A bank located at 157th Ave/148 Street has been robbed
Police were too late and did not catch the robbers
08:17AM – The robbers enters the bank
08:31AM – The robbers leave the bank with money
08:40AM – Police arrives too late. They seal the city. Robbers cannot escape the city. Robbers leave the bank in three different cars.
Interesting… Let us construct our own story and key points
- We know that the cars are leaving the bank between 08:31AM – 08:40AM
- We know the bank is located on 157th Ave / 148 Street
Can we figure it out based on these key points?
Firstly lets take a look at the overall traffic to get information out what data the table has.
Traffic
In the table we have the columns Timestamp, VIN, Ave and Street and great amount of rows.
Lets begin by querying with the facts we know:
Traffic | where Timestamp between (datetime(2022-10-16 08:31:00Z)..datetime(2022-10-16 08:40:00Z)) | where Ave == 157 and Street == 148
This will limit our data to the facts timespan and ave/street. I then want to join project all the VIN-numbers that has moved within our data and join all the traffic back in based on VIN. So we get all traffic based on VIN-numbers that moved with our initial timestamp and street.
Traffic | where Timestamp between (datetime(2022-10-16 08:31:00Z)..datetime(2022-10-16 08:40:00Z)) | where Ave == 157 and Street == 148 | project VIN | join kind=inner (Traffic) on VIN
We then want to get our max timestamp by VIN-number within our selection by adding summarize.
Traffic | where Timestamp between (datetime(2022-10-16 08:31:00Z)..datetime(2022-10-16 08:40:00Z)) | where Ave == 157 and Street == 148 | project VIN | join kind=inner (Traffic) on VIN | summarize max(Timestamp) by VIN
This will give us the max_timestamp per VIN as well as VIN-numbers. Then again we want to add the traffic based on our new facts.
Traffic | where Timestamp between (datetime(2022-10-16 08:31:00Z)..datetime(2022-10-16 08:40:00Z)) | where Ave == 157 and Street == 148 | project VIN | join kind=inner (Traffic) on VIN | summarize max(Timestamp) by VIN | join kind=inner (Traffic) on VIN, $left.max_Timestamp == $right.Timestamp
Here we perform yet another join, but with a twist. We join the traffic back in based on VIN-numbers, but match the timestamps. Hence $left.max_Timestamp == $right.Timestamp. So now we have all the traffic with our max_timestamp and VIN within the timeframe initally defined. We know the three cars have left together, now we just need to figure out where they are!
Traffic | where Timestamp between (datetime(2022-10-16 08:31:00Z)..datetime(2022-10-16 08:40:00Z)) | where Ave == 157 and Street == 148 | project VIN | join kind=inner (Traffic) on VIN | summarize max(Timestamp) by VIN | join kind=inner (Traffic) on VIN, $left.max_Timestamp == $right.Timestamp | summarize count() by Ave,Street | where count_ >= 3
We summarize the the count of VIN-numbers in each Avenue and Street and filter on where the count of VIN-numbers is 3 or more. Voilà! We’re left with two combinations of Avenue and Street.
There are probably more efficient ways of solving this, without the joins. But it worked! Feel free to optimize the query.