Technical Analysis (Engineering) of NTSB Preliminary Report M/V Dali

Under 1.4 in the very first message it should read DG2 instead of DG3.

As a hint about the crew: sister ship M/V Cézanne was manned by a crew of 25 Indian nationals, including the Master, about 5 years ago.

Here are some additional thoughts, not structured at all.

Could the blackouts have been kept shorter? Good question…

How long can the ME (Main Engine) ride through a power outage at the speeds relevant here?
How does the windmilling effect on the propeller vary with speed when propulsion is lost?
This refers especially to the lubrication pumps and cooling water pumps no longer being powered, assuming that all controls remain powered.
Note that the publicly available engine designation does not reflect further option details.

Can the ME tolerate, let’s say, 3 seconds of power loss, about the time required to reconfigure a modern power distribution, e.g. one based on a segmented double busbar ring? Advanced digital protection systems allow high selectivity, i.e. they reduce the risk that a single fault causes subsequent issues.
For example, if a generator is lost, non-critical loads are immediately shed and the most critical loads are sequentially brought back online according to process requirements if they lost power.
Dynamic Positioning, typically, may require a very high availability which can only be achieved by fault-tolerant architectures, so no single point of failure is acceptable.
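The shed-then-restore idea described above can be sketched in a few lines. This is only an illustration under assumptions: the load names, kW figures and priority numbers are invented for the example and are not taken from the PR or from any real power management system.

```python
from dataclasses import dataclass

@dataclass
class Load:
    name: str
    kw: float
    priority: int  # 1 = most critical; higher numbers are shed first (hypothetical scale)

def shed_loads(loads, available_kw):
    """Drop the least critical loads until total demand fits the available power."""
    kept = sorted(loads, key=lambda l: l.priority)
    shed = []
    while kept and sum(l.kw for l in kept) > available_kw:
        shed.append(kept.pop())  # last element is the least critical one
    return kept, shed

def restore_sequence(loads, available_kw):
    """Order in which dead loads are brought back: most critical first, within capacity."""
    sequence, used = [], 0.0
    for load in sorted(loads, key=lambda l: l.priority):
        if used + load.kw <= available_kw:
            sequence.append(load.name)
            used += load.kw
    return sequence

# Hypothetical figures, for illustration only.
loads = [
    Load("ME lube oil pumps", 300, 1),
    Load("ME cooling water pumps", 250, 1),
    Load("steering gear", 150, 1),
    Load("engine room ventilation", 200, 2),
    Load("reefer containers", 2500, 3),
    Load("accommodation HVAC", 400, 4),
]
kept, shed = shed_loads(loads, available_kw=1000)
print([l.name for l in shed])
print(restore_sequence(loads, available_kw=1000))
```

A real system would of course also stagger the restarts in time (motor inrush currents) and follow the process requirements of each consumer; the sketch only shows the priority logic.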

Could the 1st blackout have been avoided?
We cannot know, but its duration could have been reduced to a few seconds at most if the recovery had been performed by automation, that being the time required to bring TR1 back online.
Anyway, HR2 and LR2 were closed manually by crew intervention.

When the 1st blackout happened, simply closing HR2 and LR2 manually would have fully restored power. An automated system would have done exactly that, the rationale being:
There is a problem with TR1, as it was disconnected automatically by the tripping of HR1 and LR1, so let’s use TR2, which is there precisely to replace TR1 when it fails (only TR1 or TR2 is in use at any time; according to the PR they are not operated in parallel).
Duration of the blackout? Just a few seconds with automation, though it depends on how TR2 is magnetized, as it is a fairly large transformer sized to supply the whole LV BUS alone at any time.
Without a detailed power distribution diagram it is impossible to know exactly how large (in kVA) TR1 and TR2 are; logically they should be identical.
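The changeover rationale above amounts to a very small piece of logic. Here is a minimal sketch, assuming a hypothetical controller that only sees the four breaker states and whether the HV BUS is live; the function name and the command strings are invented for illustration and do not describe the ship's actual power management system.

```python
def auto_changeover(breakers, hv_bus_live):
    """Return the breaker commands a hypothetical automatic changeover would issue.

    breakers maps a breaker name to True (closed) or False (open).
    """
    tr1_out = not breakers["HR1"] and not breakers["LR1"]
    tr2_out = not breakers["HR2"] and not breakers["LR2"]
    if not hv_bus_live:
        return []  # no source available to feed TR2
    if not tr1_out or not tr2_out:
        return []  # TR1 still in service, or TR2 already (partly) connected
    # Energize TR2 from the HV side first, then connect it to the LV BUS.
    return ["close HR2", "close LR2"]

# Situation as described for the 1st blackout: HR1 and LR1 tripped, HV BUS still live.
state = {"HR1": False, "LR1": False, "HR2": False, "LR2": False}
print(auto_changeover(state, hv_bus_live=True))
```

A real implementation would first have to check that the trip was not caused by a fault on the LV BUS itself (otherwise TR2 would be closed onto the same fault), which is deliberately left out here.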

The 3000 kW bow thruster BT is fed by the HV BUS (in the PR diagram the bus corresponds only to the two horizontal segments left and right of the bus tie breaker HVR; the gray shaded rectangle has been chosen somewhat arbitrarily and includes much more than the bus itself).
As the BT propeller pitch is variable, the 3000 kW only represents the maximum power. Also, when manoeuvring, the reefer container load can be shed for a few minutes if required.

The quite high maximum power required for the reefer containers is supplied by an unknown number of 6600 V/440 V transformers that are not part of the diagram. The power actually required can vary widely depending on the refrigerated goods transported and the ambient temperature, as reefer containers are basically insulated ISO containers with a refrigeration compressor.

If HR1 tripped due to an overload of TR1 (and LR1 to avoid TR1 being energized in reverse, i.e. from the LV BUS through LR1), then TR2, once brought online, would also trip if the ratings are identical and the overload remains the same. However, the delay until the digital overload protection function, or direct thermal protection based on transformer temperature measurement, trips HR2 would be longer, as TR2 was cold when put online. In such a case it is quite possible that the trip of TR2 would have occurred after the allision.
Of course we don’t know if there was an overload; it is just given as an example.
As DG3 and DG4 both remained connected to the HV BUS, the reason why HR1 tripped is not related to an HV problem on that bus at that specific time.
If not too severe, too long or too frequent, overloads are most often not a huge issue; what matters most is how well the transformer can be cooled, as increased temperatures reduce its lifetime.
Some main service ship transformers are tightly sized because cost constraints are high, which explains why occasional overloads can occur during normal use.
Harmonics may possibly become a problem, as there are more non-linear loads than in the past.
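To illustrate why a cold TR2 would take longer to trip than a warm TR1 under the same overload, here is a sketch using a simple first-order thermal replica, the kind of model commonly used by thermal overload protection functions. All numbers (30 minute time constant, 1.05 pickup factor, 130 % load) are assumptions chosen for the example; nothing is known about the actual settings on board.

```python
import math

def trip_time_s(i_pu, tau_s, theta0_pu=0.0, k=1.05):
    """Seconds until a first-order thermal replica reaches its trip level.

    i_pu      load current in per unit of rated current
    tau_s     thermal time constant in seconds (hypothetical value in the example)
    theta0_pu initial thermal state: 0 = cold, 1 = steady state at rated current
    k         pickup factor; trip level is k**2 in per-unit temperature rise
    """
    theta_trip = k * k
    theta_final = i_pu * i_pu  # steady-state temperature rise is proportional to I^2
    if theta_final <= theta_trip:
        return math.inf  # this load never reaches the trip level
    if theta0_pu >= theta_trip:
        return 0.0
    return tau_s * math.log((theta_final - theta0_pu) / (theta_final - theta_trip))

cold = trip_time_s(1.3, tau_s=1800.0)                 # transformer just energized
hot = trip_time_s(1.3, tau_s=1800.0, theta0_pu=1.0)   # transformer already at rated temperature
print(f"cold: {cold / 60:.1f} min, hot: {hot / 60:.1f} min")
```

With these assumed values the cold transformer trips after roughly 32 minutes and the warm one after roughly 5 minutes, which is the point made above: the same overload leaves a cold transformer in service much longer.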

The other cause would be a short circuit, which is, simply speaking, a massive sudden overload. This is considered a fault and will damage the transformer very quickly if power is not removed. There are zillions of standards which specify the requirements for nearly all electrical equipment, and they also depend on the region.

Currently we don’t know why TR1 tripped. An overload would not be very surprising, as it can happen; a short circuit (internal or load-side) and other causes such as an insulation fault are less common.
Human errors, as well as malfunctions of protection relays and/or power management systems, are very plausible.

I forgot to mention that, to avoid reverse powering the transformers, if the supply-side breaker (HR1 or HR2) trips, the corresponding load-side breaker (LR1 or LR2) also trips; at least it is typically handled that way, although the PR does not mention it.

When the 2nd blackout happened there were 2 possible scenarios:

  1. DG3 and DG4 were still available

It is unknown whether DG3 and DG4 were shut down after DGR3 and DGR4 tripped; tripping a generator breaker does not necessarily require the driving diesel engine to be shut down. Several generators running in parallel can trip at the same time due to overload.
Overload of a single generator operated in parallel with one or more other generators (of similar or different sizes, in kVA) can occur e.g. if there is a load sharing problem. The power each generator produces when running in parallel is managed by the load sharing control function of the digital generator control (here rather crude Hyundai devices, I didn’t see any Deif or Woodward). If for any reason one generator no longer provides enough power, the other generators can end up overloaded and ultimately all generators may trip.
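The cascade described above can be shown with a toy model. It is only a sketch under assumptions: load is shared in proportion to rating, a unit trips when it exceeds 110 % of its rating, and a weak unit is modelled by a simple cap on what it can deliver. The ratings, the load and the trip factor are invented for the example and are not the ship's data.

```python
INF = float("inf")

def share_load(load_kw, units):
    """Split the load in proportion to rating; a capped unit leaves the rest to the others."""
    output, remaining, free = {}, load_kw, dict(units)
    while free:
        total_rating = sum(u["rated_kw"] for u in free.values())
        capped = [n for n, u in free.items()
                  if remaining * u["rated_kw"] / total_rating > u["cap_kw"]]
        if not capped:
            for n, u in free.items():
                output[n] = remaining * u["rated_kw"] / total_rating
            return output
        for n in capped:  # pin weak units at their cap, redistribute the remainder
            output[n] = free[n]["cap_kw"]
            remaining -= free[n]["cap_kw"]
            del free[n]
    return output

def surviving_units(load_kw, units, trip_factor=1.1):
    """Names of the generators still online once all overload trips have settled."""
    online = dict(units)
    while online:
        output = share_load(load_kw, online)
        if sum(output.values()) < load_kw - 1e-6:
            return []  # demand cannot be met: the bus collapses and everything trips
        tripped = [n for n in online if output[n] > trip_factor * online[n]["rated_kw"]]
        if not tripped:
            return sorted(online)
        for n in tripped:
            del online[n]
    return []

# Hypothetical ratings and load, for illustration only.
healthy = {"DG3": {"rated_kw": 4000, "cap_kw": INF}, "DG4": {"rated_kw": 4000, "cap_kw": INF}}
weak = {"DG3": {"rated_kw": 4000, "cap_kw": 500}, "DG4": {"rated_kw": 4000, "cap_kw": INF}}
print(surviving_units(5000, healthy))  # both units carry 2500 kW each
print(surviving_units(5000, weak))     # DG4 is pushed to 4500 kW, trips, then DG3 cannot hold the bus
```

In the second case one weak generator is enough to take both units off the bus, which is the mechanism meant by "ultimately all generators may trip".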

Technically speaking, an overload could first have tripped TR1, causing the first blackout, and later have caused the trip of DG3 and DG4, though TR1 could also have tripped again; it would depend on exactly how the protections have been set. TR1 and TR2 should each be sized to step down the power generated by somewhat more than two generators, maybe 2.5 or 3, but here I’m guessing (part of the generated power is not stepped down by TR1 or TR2 when the bow thruster BT is operating and/or when reefer containers require power).

If the fault can be cleared, reconnecting DG3 and/or DG4 can be attempted immediately, as the generator controller will not close the generator breaker unless certain conditions are met (HV BUS is dead and voltage and frequency are within limits and stable; or synchronization is ready to complete and, if applicable, the enable signal from an additional synchrocheck is present).
In such a case the delay would have been a few seconds once the reconnection is initiated manually for one DG (no sync required).
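The breaker close conditions listed above can be written down as a simple permissive check. This is a sketch only: the 6600 V and 60 Hz nominal values come from this post, while every tolerance (5 % voltage, 1 % frequency, 0.2 Hz slip, 10 degrees phase, 10 % dead-bus threshold) is an assumed figure for illustration and not a setting of any real generator controller.

```python
from dataclasses import dataclass

NOMINAL_V = 6600.0  # HV BUS voltage
NOMINAL_HZ = 60.0   # corresponds to 720 RPM for these generators

@dataclass
class Reading:
    volts: float
    hertz: float

def within_limits(r, v_tol=0.05, f_tol=0.01):
    """Generator voltage and frequency close enough to nominal (assumed tolerances)."""
    return (abs(r.volts - NOMINAL_V) <= v_tol * NOMINAL_V
            and abs(r.hertz - NOMINAL_HZ) <= f_tol * NOMINAL_HZ)

def may_close_breaker(gen, bus, phase_deg=0.0, synchrocheck_ok=True):
    """True if the (hypothetical) controller would allow the generator breaker to close."""
    if not within_limits(gen):
        return False
    if bus.volts < 0.1 * NOMINAL_V:
        return True  # dead bus: no synchronization needed
    in_sync = (abs(gen.volts - bus.volts) <= 0.05 * NOMINAL_V
               and abs(gen.hertz - bus.hertz) <= 0.2
               and abs(phase_deg) <= 10.0)
    return in_sync and synchrocheck_ok

dead_bus = Reading(0.0, 0.0)
live_bus = Reading(6580.0, 59.95)
print(may_close_breaker(Reading(6600.0, 60.0), dead_bus))                 # first DG onto a dead bus
print(may_close_breaker(Reading(6600.0, 60.0), live_bus, phase_deg=40))   # second DG, not yet in phase
```

The dead-bus branch is why reconnecting a single running DG after a blackout can be a matter of seconds: no synchronization is required, only a healthy generator.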

  2. Neither DG3 nor DG4 is available

DG2 started automatically anyway, as it was in stand-by, and was automatically connected to the HV BUS when DGR2 closed automatically, which powered the HV BUS. It is totally illogical that TR2 was not automatically put into service by automatically closing HR2 and LR2 no more than a few seconds after DGR2 had closed.

In any case, a good power automation system would have been able either to reconnect DG3 and DG4 if they were still running correctly (which would have shortened the 2nd blackout to a few seconds) or to close DGR2 automatically shortly after DG2 reached a stable 720 RPM (60 Hz) and 6600 V. A possible DG warm-up delay should be overridable in an emergency, similar to the ME, where some protections are disabled by pushing a special emergency button.
As DG2 was in stand-by, it was kept warm anyway, ready to be started at any time.
As an example, some critical power emergency generators with large high-speed diesel engines are fully loaded in a single step within 10 seconds of being started.

Although the PR is not clear about some details, the information provided clearly shows that the first blackout could have been ended in a few seconds by manually closing HR2 and LR2 (by remote order, not by operating the breakers locally), maybe allowing for a reaction delay of some 30 seconds.
As said, automation could have reduced the 1st blackout to a few seconds.

The second blackout was handled very well by the crew. The transformer TR2, which should have been connected automatically, was very quickly powered manually by the crew, who closed both HR2 and LR2 only 31 seconds after DG2 had begun its start sequence, which was initiated automatically when the HV BUS went dead.
It’s not even certain that automation would have been faster, as the crew was possibly monitoring the RPM and voltage of the starting DG2 and just waited until the conditions were met to energize TR2 manually. Inadvertently closing HR2 prematurely would normally have been prevented by a logical safety interlock.

Would the blackouts have been avoidable? Maybe, maybe not, but at least the first one could have been kept much shorter with automation. Unfortunately the power management system is rather primitive and unable to reconfigure the power distribution within seconds.
Merchant ships are antique when it comes to integration, automation and SCADA compared to top industries. Only individual devices have modern, mostly local, controls and small Operator Panels, while the layout of the engine control room is not really modern and makes it difficult to react quickly. It also does not allow a good overview: there are lots of very different displays, indicators and controls for many separate systems. Not to mention possibly confusing audible alarms (this also applies to the bridge (wheelhouse)); a buzzer is not very useful if no one knows what it refers to. Maybe a voice alarm message like in cockpits would be better.
The engine control room ergonomics are poor and make running such complex systems less easy than it could be.
More integrated, up-to-date and very reliable engine control room designs are possible, and somewhat less basic automation would reduce operator errors and enhance safety. While many local operator panels are unavoidable, as many local control systems will remain, interfacing them in order to operate them centrally, or at least to access read-only data centrally, will become more and more important. Ideally this should rely on existing interface standards. Some dedicated displays, like those for the ME (MOPs), will remain, but a myriad of small displays can be replaced by screens. Of course all important controls must remain discrete (pushbuttons and other switches, rotary encoders, important pilot lights, …).
Many SCADA and process control systems are not well designed because those who engineered them have no idea how they will be used by real people in a real environment.

For the 2nd blackout there is not enough information to know whether DG3 and/or DG4 could have been reconnected immediately. If not, the crew did about the best possible by reconnecting TR2 within only 31 seconds.
That is extremely short when under stress in the control room of a ship which has lost propulsion.

I do not know if the crew made mistakes which caused the first blackout but, considering the timeline, kudos to the crew who handled the power generation and distribution issues, especially in such a critical situation.
It is easy to comment on the incident while watching a YouTube video, whereas the elapsed time between the first power outage and the allision was indeed very short.

I know I shouldn’t address it here, but consider that the first ones to blame are those who are reluctant to invest maybe even just one additional % of the new ship price (maybe roughly 150 to 190 million US dollars for a large container ship), as it would make a HUGE difference.

Overall I consider that this was an unfortunate accident.
