Ottawa LRT design issues

There were 8.4 million New Yorkers and at one point, I thought there were 8.4 million traffic engineers because everyone had really strong opinions. – Janette Sadik-Khan

The east-west double-tracked Ottawa LRT (Confederation Line, Line 1) is well designed.  It has complete grade separation, which means it never intersects with car traffic.  It is electrically powered.  It uses modern Communications-Based Train Control (CBTC), which lets trains run safely at low “headways” (trains arrive close together in time), even though the trains switch tracks at the end of the line (e.g. the east-bound train arriving at Blair turns into a west-bound train, heading back west for a while on the same track it arrived on).

CBTC means that trains are substantially automated.  The CBTC system keeps track of all of the train locations.  To some extent once you decide on the train spacing, it is CBTC that decides how fast trains go, which rail segments trains are on, and what the settings of the track switches are.

The rail operators are basically in an oversight mode, monitoring things and dealing with exceptions.

I am not an expert in train systems, but what follows is my analysis as someone who has done a lot of computer network system troubleshooting.

Cascade Failure

In the design of complex systems, you want to avoid a cascade failure.  That is, the failure of a single component should not cause the failure of most or all of the entire system.  In the context of a rail system:

failure of a single train should not have a significant impact on the rest of the rail system.

Unfortunately, in this area of design, Ottawa’s Light Rail Transit (LRT) currently fails.  The fundamental issue is not the doors; the issue is that

when a train is in door failure mode, it cannot move under control of the CBTC system.

That means a train in door failure mode has to be excluded from CBTC.  But since CBTC controls all the movement of all the trains, in order to safely exclude a train from control you have to remove an entire track from control, and manually drive the train back, slowly, to a safe location or maintenance location.

But you’ve then converted your dual-track system into a single-track system, with trains going both ways on a single track.  As you can imagine, that’s a situation where you have to be incredibly careful, and clearly you can only use the track in one direction at a time.  So that’s why you go from 5-minutes-or-less trains to 15-minutes-or-more, because there is very complex and careful management needed behind the scenes to safely use the single track.
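To put rough numbers on the drop from 5-minute to 15-minute service, here is a minimal Python sketch.  The 600 riders per train is a round-number assumption for illustration only, not an official figure, and the function names are made up for this post.

```python
def trains_per_hour(headway_min: float) -> float:
    """Trains passing a point per hour, in one direction, at a given headway."""
    return 60.0 / headway_min


def riders_per_hour(headway_min: float, riders_per_train: int) -> float:
    """Riders that can be carried per hour, in one direction."""
    return trains_per_hour(headway_min) * riders_per_train


RIDERS_PER_TRAIN = 600  # ASSUMPTION: illustrative round number, not an official figure

normal = riders_per_hour(5, RIDERS_PER_TRAIN)     # 5-minute headway
degraded = riders_per_hour(15, RIDERS_PER_TRAIN)  # 15-minute headway

print(f"normal:   {normal:.0f} riders/hour/direction")
print(f"degraded: {degraded:.0f} riders/hour/direction")
print(f"capacity kept: {degraded / normal:.0%}")
```

Whatever the real train capacity is, the ratio is the same: tripling the headway leaves about a third of the capacity.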

In short, removing a train from CBTC is a serious failure of the system, one that dramatically reduces system capacity and dramatically lengthens the spacing between trains.  RTG, OC Transpo, city staff and whoever else is involved in making these decisions should have done a risk matrix, and the risk assessment should have identified the situations in which a train would have to be excluded from CBTC control and tried to minimize them.

In this specific example, as long as you can manage the safety issues of a train that has a door failure, you should continue to run that train under CBTC.  And there could be lots of human ways to manage the safety issues, including staff in the train and staff on the platform, as well as station announcements.

UPDATE: Ken Woods says “CBTC no longer needs to be cut out when bypassing doors.”  So this should fix the major cascade issue that leads to big train delays.

Also, for those of you who want to know more about solutions than design problems, I have highlighted OC Transpo’s Plans at the end of this post.  END UPDATE

The Whole Story

Now that we know the end of the story, we need to go back to the beginning.  Every component of the LRT is a design decision.  (In fact, the decision to have light rail rather than heavy rail is a decision in and of itself.)  While the design that won’t let a train travel under CBTC control is a single decision with huge consequences, the design that led to the door issues is a much longer story.

UPDATE 2019-10-16: This blog post is specifically about how door failures were happening, and how door failures could lead to a system failure.  The trains are complex machines running complex software; there are many other ways they can fail that I don’t address below.  END UPDATE

>> Low-floor vs. High-floor

There are basically two kinds of modern train design: low-floor and high-floor.  The difference is basically whether the wheels (the bogies, in train terminology) are within the floor space (low floor) or below the floor space (high floor).  Most metro systems (e.g. New York) have high floor, and you also see this in VIA Rail trains, which is why you have to go up a set of steps from ground level to get into a VIA train.

High-floor means either you need steps to get onto the train (which is no good for modern accessibility) or you need high platforms, with the rails fairly deeply below the platform level (e.g. as you see in New York).  High-floor also means that the space inside the train car is completely unobstructed; it’s just an empty box you can arrange as you want.

Modern trams are designed to work at street level, because in many cases they’re running on regular streets mixed with car traffic.  Since you can’t reasonably design either deep trenches into the street or periodically raise the sidewalk up dramatically for tram stops, modern trams are low-floor.

Since Ottawa has a totally grade separated system with dedicated stations, we could have chosen high-floor, but we didn’t.

Ottawa uses a low-floor tram design, the Citadis Spirit.

That means the wheels reach up into the interior of the train cars, which is why you get those characteristic humps inside the trains, usually with seating on them.  It also means you get a narrow corridor between the humps (which is the spacing between the wheels on either side of the train).

>> Seating

There are basically two kinds of seating design for transit: front and rear-facing seats, like you see on a bus, and side seating (perimeter seating), like you see on a metro system (e.g., again, New York).  You would generally choose side seating for a high-capacity system, which means most people are standing, and front and rear-facing for a more comfortable but lower capacity system.

A critical difference is that you can clear side-seating cars quickly, because most people are standing anyway.  If it’s high-floor side-seating it is even better, because the space itself is completely unobstructed so people can move freely.

I’m not going to link to this article in DailyHive, because it has autoplay video – TransLink considering side seating for new order of 203 SkyTrain cars – but I will quote from it

perimeter seating generally provides more overall carrying capacity within each car with its allowance for greater standing room. This layout also creates more vertical and overhead handrails for standing passengers to hold on to.

With wider corridors between the seats, such a sideways seating layout could make the trains more efficient for quick ingress and egress through the car doors.

Priority seat area on Kyoto Subway 1117

Ottawa’s trams have a fairly conventional front and rear-facing seat arrangement, which in combination with low-floor means that there are some pinch points within the train.  In the worst case to get off at a station you might have to stand up, walk through a narrow corridor between seats, and then make your way through the people standing next to the doors.

There are lots of human-factor considerations about which kinds of seating people prefer.  See e.g. Chicago magazine – subway seating options.

>> The Overhead Bar

In general you want people equally distributed along the interior of your train car.  But Ottawa’s cars have very high grab bars with no straps hanging down, which means lots of people can’t reach the bars to hold on.  If you’re standing, you really want to have the comfort of a handhold.  Because of this, people are clustering near areas where lower handholds are available.

>> I’ve Got a Door in my Pocket

There are two kinds of door design.  In a metro system (e.g. yes, New York again) you have pocket doors, where the doors slide into the side of the car.  In Ottawa’s trams, the doors instead open out and over, physically moving outside and to the side of the doorway.

I’m no door expert, but doors that run back and forth on a track and sit mostly inside the side of the train seem like they would have fewer failure modes than doors that have to go through a complex range of motion to close, and that sit fully exposed to manipulation on the outside edges of the doorway.

Be mindful that doors of either design fail on transit systems all the time; it’s the most common failure.  See e.g.

Train Design Summary

So basically the train things under design control are:

  • high-floor vs. low-floor, which determines how much usable space you have inside the train car, and some of the ease of movement within the car
  • front-facing seating vs. side-seating, which determines how much open space you have for standing passengers vs. seated space, and decides to some extent how easy it is to move within the train car
  • grab bar and straphanger design, which will determine where people are comfortable standing
  • door design, which to some extent determines what kind of failure modes your doors may have

>> Dwell Time

The amount of time a train spends at a station is called dwell time.  To minimize the amount of time it takes to get from one end of the entire rail line to the other, you need to minimize the amount of time each train spends at a station.  This includes minimizing the amount of time doors are open.  Ottawa’s LRT doors are on automated timings, I believe of less than two minutes per station.

>> The Bus Legacy

The bus is very different from the train.  The bus is an entirely human-driven system.  Everything is under human control.  The driver can decide when to stop, where to stop, when to open and re-open the doors, everything.

The bus is also super-jolty, in particular when the driver brakes hard at a stop or for cars and other roadway dangers, or accelerates away from a stop.

The bus is unpredictable due to cars, so it may arrive at a time different from the planned schedule.

The bus in many cases is infrequent, sometimes running only every half-hour.

These characteristics make for certain very understandable human behaviours on the bus.

First, people will often try to sit down as quickly as possible, so that they aren’t standing when the bus jolts into motion.  Also people will try to stand up as late as possible, so that they aren’t standing when the bus jolts to a halt.  And in general people would rather sit than stand, due to the aforementioned jolting plus the added bumpiness of the ride in general.

This means people are quite often late to exit the bus, and just push the bars to hold or reopen the doors, or shout at the bus driver (“back door!”) to get the doors open if they’re outside the automatic door cycle.

So: slow to exit the bus.

The unpredictability of the bus means that when people do see a bus, they will run to grab it, often requiring the driver to stop and reopen the doors.

So: in a rush to get on the bus.

The flow of passengers on and off the bus is managed by a combination of bus driver oversight and passenger control, including people stepping off and back onto the bus if it is super crowded, with the passengers inside holding the door.

These behaviours don’t translate at all well to the train, which is nothing like the bus.  The train is smooth, and the doors are entirely on automatic timing, and (outside of major system failure) there is another train coming soon.

>> Platform Human Factors

On a platform, you want people to spread out so that they are using all the doors equally.  But it’s just human nature that people tend to stop as soon as they reach the bottom of the stairs.  It will take some time for people to switch from the bus mindset of basically standing near a single location to automatically moving down the platform.

>> Rush Hour

The Commute as we call it, Rush Hour, is an entirely separate design issue about which I will probably do a separate blog post.  But since we can’t redesign the entire city, and change school and employer expectations and employee behaviour, we’re stuck with this super-peak of demand called Rush Hour.

So you have to design your system for Rush Hour.

Normally what would happen is your system is adapted as Rush Hour grows.  Your system changes as demand changes.  If you do have to switch to a new system once demand reaches a certain level, you switch early, let’s say at 75% to 80% of capacity.

The huge design issue in Ottawa was that we waited until our system was basically at 100% capacity before switching to a new design, before switching from Bus Rapid Transit (BRT) to LRT.  (I will set the unfortunate politics of Canadian transit that caused this late shift aside.)

That means we went from a 100% capacity Rush Hour BRT system to an LRT system that would have very heavy peak demand.

Very heavy demand is a transit planner’s dream in some sense.

Unfortunately it would be very hard to simulate this level of real-world demand.  You’d have to have literally thousands of volunteers.

Another option would be to gradually ramp up demand, and adjust step by step.  But with a complex bus system this would have been tricky and confusing, with buses removed gradually over time.

Presumably transit planners thought that by instead running the entire bus system in parallel for three weeks, they were doing the best they could at gradual demand increase, assuming people would slowly transition from bus to train during those weeks, so that by the end they would be running at full train demand.

What seems to have happened though is a step change in demand when the buses were discontinued.

And Now The Deluge

That step change in demand seems to have triggered a cascade failure, which goes something like this:

1. Buses are now dropping off all of their passengers at Blair and Tunney’s, creating peak demand that needs to be quickly cleared.  So there are two big boarding demand sources.

2. People are mostly going to work or to school, which means heavy demand to get off the train at uOttawa and Parliament Stations.

3. With the extra number of people added by the end of the parallel bus service, the design of the trains means that people are not getting to the doors in time.  This is a combination of design factors:

a. low-floor design means there are some narrow corridors in the train cars
b. front and rear-facing seating means there are lots of people seated
c. having lots of seating means people have to navigate their way past the seats
d. high grab bars means that standing passengers cluster in certain areas of the car where they can reach a handhold

Plus, let’s keep in mind that humans don’t like “crush load”.  People will stand together, but they’re not necessarily going to pack themselves together to fill the maximum capacity of the train.

4. The automated door timing that is part of short dwell times at stations means people are not able to get out at their station before the doors close.

For a passenger, not getting off at your station is basically a serious transit failure when you’re on a train.  If you need to get off at uOttawa and you can’t get off, you’re stuck either taking a train back or walking back, when you may have timed things to get to a class or an exam.  Train transit systems must be designed so people can get off at their stop.

When people can’t get off at their stop they will understandably panic and try to get the doors open.

5. The automated door timing that is part of short dwell times at stations means people may not be able to get on at their station before the doors close.

This is not a failure in a train transit system with frequent trains.  You wait for the next train.  But keep in mind this is not at all how the bus worked.  So people understandably are trying to rush onto the train, and they are trying to get the doors open to do so.

6. Getting off and on magnified

The next thing that will happen is if people have learned they may not get off and on the train as designed, they will change behaviours in ways that magnify the problem.

People worried about getting off will stand near the doors even if it isn’t their stop, which means they will slow people who need to get off the train.

People worried about getting on will stand near the doors and rush the doors, which means they will slow people who need to get off the train.

So as soon as people lose confidence in the train it gets even worse.

7. Apparently the doors can error out in two ways.  If the door tries to close three times and it can’t, it will error out, which the train operator can reset.  But if the door is physically out of alignment, this is basically an unfixable error until it is realigned.
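As a way of picturing those two error paths, here is a small Python sketch of a door as a state machine.  It is only a model of the behaviour as described above, not the actual train software: the class, state names and methods are all hypothetical, and the only number taken from the description is the three close attempts.

```python
from enum import Enum


class DoorState(Enum):
    OK = "ok"
    FAULT_RESETTABLE = "errored out, operator can reset"
    FAULT_MISALIGNED = "physically misaligned, needs realignment"


MAX_CLOSE_ATTEMPTS = 3  # "tries to close three times", per the description above


class Door:
    """Hypothetical model of one train door; not the real control logic."""

    def __init__(self) -> None:
        self.state = DoorState.OK
        self.failed_attempts = 0

    def try_close(self, obstructed: bool, misaligned: bool = False) -> DoorState:
        if self.state is not DoorState.OK:
            return self.state  # a faulted door stays faulted until handled
        if misaligned:
            self.state = DoorState.FAULT_MISALIGNED
        elif obstructed:
            self.failed_attempts += 1
            if self.failed_attempts >= MAX_CLOSE_ATTEMPTS:
                self.state = DoorState.FAULT_RESETTABLE
        else:
            self.failed_attempts = 0  # door closed cleanly
        return self.state

    def operator_reset(self) -> DoorState:
        # A reset only clears the "too many attempts" fault;
        # a misaligned door stays faulted until it is physically fixed.
        if self.state is DoorState.FAULT_RESETTABLE:
            self.state = DoorState.OK
            self.failed_attempts = 0
        return self.state


door = Door()
for _ in range(MAX_CLOSE_ATTEMPTS):
    door.try_close(obstructed=True)  # e.g. someone holding the door
print(door.state.name)               # FAULT_RESETTABLE
print(door.operator_reset().name)    # OK
```

The point of the sketch is the asymmetry: one fault is a nuisance the operator can clear at the platform, the other takes the door (and so the train) out of normal service.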

This is where the real problem comes.  Failed doors are to be expected in transit systems.  But the train control (CBTC) system we have (Thales SelTrac™) in combination with the train we have (Alstom Citadis Spirit) has this decision in its risk matrix: a train with failed doors is not allowed to move under automated train control.

My thanks to Ken Woods (@drivesincircles) for explaining the train control issue on Twitter.

This was the eye-opening tweet for me:

He goes on to provide additional detail:

CBTC also enforces train suitability, meaning the CBTC system will not allow a train to move with a safety issue like an open door.  So, bypassing that safeguard requires us to bypass CBTC, making the train itself invisible to the other trains being controlled by CBTC.
— Ken Woods (@drivesincircles) October 9, 2019

Once bypassed, we have to move that invisible train back to the depot, and prevent other trains from getting too close. A bypassed train can only move at 25kph, and for obvious reasons cannot stop at stations.
— Ken Woods (@drivesincircles) October 9, 2019

I blinked too. The train will not authorize traction with an open door and an active VOBC. No idea if this is an Alstom or Thales thing, but the door loop override kills movement until the train is cut out.
— Ken Woods (@drivesincircles) October 9, 2019

(VOBC means Vehicle On-Board Computer.)

And this CBTC interaction with the trains is where everything falls apart.

Because the train can’t move under automated control, the entire less-than-5-minutes train system running on dual tracks gets replaced with a single track under automated control and a manual track where the train slowly chugs back to be repaired.

This means trains now arrive at something like 15 minute or longer intervals.

But Rush Hour peak demand means hundreds of people arriving continuously by bus at Blair and Tunney’s, which means they have to be cleared out on packed trains every 5 minutes or less, or the stations become dangerously crowded.

Which is what happened.

We should just be grateful that people managed this massive overcrowding.
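A toy simulation shows how quickly a platform fills when the headway triples.  This is a sketch under loud assumptions: 100 people per minute arriving at the station and 600 free spaces per train are invented round numbers, arrivals are perfectly steady, and the function name is made up for this post.

```python
def platform_crowd(arrivals_per_min: int, headway_min: int,
                   train_space: int, minutes: int) -> tuple[int, int]:
    """Return (peak number waiting, number still waiting at the end)."""
    waiting = 0
    peak = 0
    for minute in range(1, minutes + 1):
        waiting += arrivals_per_min           # buses keep dropping people off
        peak = max(peak, waiting)
        if minute % headway_min == 0:         # a train arrives this minute
            waiting -= min(waiting, train_space)
    return peak, waiting


ARRIVALS_PER_MIN = 100  # ASSUMPTION: illustrative, not measured
TRAIN_SPACE = 600       # ASSUMPTION: illustrative, not an official figure

for headway in (5, 15):
    peak, left = platform_crowd(ARRIVALS_PER_MIN, headway, TRAIN_SPACE, minutes=60)
    print(f"{headway:>2} min headway: peak {peak} waiting, {left} left after an hour")
```

With these made-up numbers the 5-minute service clears the platform every time, with never more than 500 people waiting.  The 15-minute service falls further behind with every train and ends the hour with 3600 people still on the platform.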

UPDATE: Ken Woods says “CBTC no longer needs to be cut out when bypassing doors.”  So this should fix the major cascade issue that leads to big train delays.  END UPDATE

Immediate Fixes

>> Dwell Time

Probably the easiest fix is to change the dwell times for the trains, particularly at busy stations.  This means just change the door times so they are open longer.  This will mean longer end-to-end train rides for everyone, but should greatly reduce the door issues.

>> Boarding Decals

We could also put down boarding decals.  People in Ottawa love following signs.  But this means the doors have to always be at the same position, regardless of train direction or train configuration, and I’m not sure this is the case (you’d have to check with OC Transpo).

Decals from TTC expanding boarding decals test to southbound St George platform

>> Straphangers

Install straps so that people can stand safely anywhere in the train.

>> Buses

If we are still having capacity issues, OC Transpo needs to add buses back into the system.  Yes, I know this is a terrible option.  It seems like there are some peak demand points, so these could be e.g. express buses that go only Blair to uOttawa or only Tunney’s to Parliament.

Medium-Term Fixes

>> Change Automated Control Scenarios

The most important fix of all is to examine train failure modes, and re-evaluate when a train should be excluded from automated (CBTC) control.  The goal should be to find mitigations so that exclusion from automated control is minimized.

As long as it is safe to do so, it is way way better to run an empty train with a door fault under automated control, than to shut down an entire track and massively increase the time between trains.

UPDATE: Ken Woods says “CBTC no longer needs to be cut out when bypassing doors.”  So this should fix the major cascade issue that leads to big train delays.  END UPDATE

>> Min Headway

If we can get headways (spacing between trains) even lower, say 3 minutes or less, it will ease platform demand.  But I understand there are risks with pushing the system to tight spacing, and so this is something to set as a goal for the medium term.

However, headway is more complicated than it might initially seem, as the rail operators have to switch from one end of the train to the other at the end stations.

On a 4 minute headway, the train pulls in and leaves again within 240 seconds if it is precisely on time. It takes the operator 120 seconds to change ends. – Ken Woods
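The arithmetic in that quote can be sketched for other headways.  The 120 seconds to change ends is the figure from the quote; everything else is an assumption, in particular that a train has to pull in and leave again within one headway, as in the quote's example.  The function name is made up for this post.

```python
CHANGE_ENDS_S = 120  # seconds for the operator to change ends, from the quote above


def terminal_slack_s(headway_min: float, change_ends_s: int = CHANGE_ENDS_S) -> float:
    """Seconds left over at the end station once the operator has changed ends,
    ASSUMING the train must arrive and depart within a single headway."""
    return headway_min * 60 - change_ends_s


for headway in (5, 4, 3, 2):
    print(f"{headway} min headway: {terminal_slack_s(headway):.0f} s to spare")
```

Under that assumption a 3-minute headway leaves only 60 seconds for everything else at the end station, and a 2-minute headway leaves nothing at all.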

Long-Term Fixes

For Stage 2, Ottawa should look at side seating for the trains.

But this is a design decision with consequences too: it means long standing commutes for people coming from the most distant stations, and it means some people having to stand on totally uncrowded trains outside of Rush Hour.

Please feel free to correct me if anything above is wrong.

Context

The thing is, this really is mainly a combination of factors that cascade at Rush Hour.  The system is very pleasant and reliable outside of peak demand.  In a way there are two different systems, which makes for a challenging design problem.  If you’ve only been on the train during peak demand hours, you should try it out during the weekend or in the middle of the day; it’s quite a different experience.

OC Transpo’s Plans

The good news is OC Transpo knows what they are doing, so they are already putting in place some of the key elements above, including:

  • A plan to install strap hangers in trains;
  • Adjusting dwell times (the amount of time a door is kept open) at stations aligning the timing to passenger volume and train frequency times;
  • Installing markings on platforms guiding customers on where to wait, so as to not block customers who are stepping off trains. [So I guess train doors are at consistent locations after all.]

Thanks to Councillor Glen Gower for sharing this information in his blog (found via Twitter).

UPDATE 2019-10-16: OC Transpo has posted two letters explaining the situation and giving planned actions, one on October 8, 2019 and one on October 10, 2019.  The October 10 letter is the one extracted above.  As a reminder, this is fairly specific to door faults causing system problems; there are lots of other kinds of problems the trains can have that require them to go back to the maintenance depot.  END UPDATE

UPDATE 2019-10-18: OC Transpo has released a web page called The O-Line with details about train and related transit issues and planned solutions.

SIDEBAR: Train terminology

In The O-Line they talk about the Train Control Management System (TCMS) – which is what was referred to above as the Vehicle On-Board Computer (VOBC).  This is the computer on an individual train.  Like all computers it can have errors and need to be reset.  The computers on the individual trains are in turn controlled by the system-wide control system, the Communications-Based Train Control (CBTC).  END SIDEBAR

The O-Line covers some key topics including O-Train doors, improvements to stations, improvements to bus operations, and Winter Operations.

END UPDATE

I have to agree with this Canadian Press article: Despite Ottawa’s LRT woes, experts say don’t judge right away.