The total distance traveled by all ships:
SELECT SUM(length(Trip)) FROM Ships; -- 807319558.5805709
This query uses the length
function to compute per trip the sailing distance in meters. We then aggregate over all trips and calculate the sum. Let's have a more detailed look, and generate a histogram of trip lengths:
WITH buckets (bucketNo, RangeKM) AS ( SELECT 1, floatspan '[0, 0]' UNION SELECT 2, floatspan '(0, 50)' UNION SELECT 3, floatspan '[50, 100)' UNION SELECT 4, floatspan '[100, 200)' UNION SELECT 5, floatspan '[200, 500)' UNION SELECT 6, floatspan '[500, 1500)' UNION SELECT 7, floatspan '[1500, 10000)' ), histogram AS ( SELECT bucketNo, RangeKM, count(MMSI) as freq FROM buckets left outer join Ships on (length(Trip)/1000) <@ RangeKM GROUP BY bucketNo, RangeKM ORDER BY bucketNo, RangeKM ) SELECT bucketNo, RangeKM, freq, repeat('▪', ( freq::float / max(freq) OVER () * 30 )::int ) AS bar FROM histogram; -- Total query runtime: 2.8 secs bucketno | rangekm | freq | bar ----------+---------------+------+-------------------------------- 1 | [0, 0] | 796 | ▪▪▪▪▪▪ 2 | (0, 50) | 3752 | ▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪▪ 3 | [50, 100) | 571 | ▪▪▪▪▪ 4 | [100, 200) | 452 | ▪▪▪▪ 5 | [200, 500) | 569 | ▪▪▪▪▪ 6 | [500, 1500) | 100 | ▪ 7 | [1500, 10000) | 22 | (7 rows)
Surprisingly there are trips with zero length. These are clearly noise that can be deleted. Also there are very many short trips, that are less than 50 km long. On the other hand, there are few long trips that are more than 1,500 km long. They look like noise. Normally one should validate more, but to simplify this module, we consider them as noise, and delete them.
DELETE FROM Ships WHERE length(Trip) = 0 OR length(Trip) >= 1500000; -- DELETE 820
Now the Ships
table looks like Figure 1.3, “Ship trajectories after filtering”.
Let's have a look at the speed of the ships. There are two speed values in the data; the speed calculated from the spatiotemporal trajectory speed(Trip)
, and the SOG
attribute. Optimally, the two will be the same. A small variance would still be OK, because of sensor errors. Note that both are temporal floats. In the next query, we compare the averages of the two speed values for every ship:
SELECT ABS(twavg(SOG) * 1.852 - twavg(speed(Trip))* 3.6 ) SpeedDifference FROM Ships WHERE SOG IS NOT NULL AND ABS(twavg(SOG) * 1.852 - twavg(speed(Trip))* 3.6 ) > 10.0 ORDER BY SpeedDifference DESC; speeddifference ------------------- 241.42049907716665 134.61257387558112 112.36643046964278 110.10490793777619 81.66118732332465 81.5725669336415 69.85832834619002 57.232404771295045 52.943341619001586 52.921746684116904 ...
The twavg
computes a time-weighted average of a temporal float. It basically computes the area under the curve, then divides it by the time duration of the temporal float. By doing so, the speed values that remain for longer durations affect the average more than those that remain for shorter durations. Note that SOG is in knot, and Speed(Trip) is in m/s. The query converts both to km/h.
The query shows the first 10 ship trajectories of the 82 in the table that have a difference of more than 10 km/h. These trajectories are shown in Figure 1.4, “Ship trajectories with big difference between speed(Trip)
and SOG
”. Again they look like noise, so we remove them with the following query
DELETE FROM Ships WHERE ABS(twavg(SOG) * 1.852 - twavg(speed(Trip))* 3.6 ) > 10;
Now we do a similar comparison between the calculated azimuth from the spatiotemporal trajectory, and the attribute COG:
SELECT ABS(twavg(COG) - twavg(azimuth(Trip)) * 180.0/pi()) AzimuthDifference FROM Ships WHERE ABS(twavg(COG) - twavg(azimuth(Trip)) * 180.0/pi()) > 45.0 ORDER BY AzimuthDifference DESC; azimuthdifference -------------------- 355.4200584570843 348.213417943632 333.7458943572906 321.5644829906112 309.6935360677792 308.4444213637132 295.5019204058323 294.7215887580075 267.8656764337898 267.09343294055583 ...
Here we see that the COG is not as accurate as was the case for the SOG attribute. More than 1600 trajectories have an azimuth difference bigger than 45 degrees. Figure 1.5, “Ship trajectories with big difference between azimuth(Trip)
and COG
” visualizes them. Some of them look like noise, but some look fine. For simplicity, we keep them all.