Programmers - why so many bugs?

Blue highway · Mar 21, 2023

grinder said:
I am sure a few more OTAs will really help. The MME is still new for Ford. We just have to give them time.

The platform is 4 years old... at what point is it no longer new?

mburtsvt · Mar 21, 2023

Because too much technology is trying to fix not enough problems.

worachj · Mar 21, 2023

The main reason is that you have to WANT to do it correctly. Ford needs a greater commitment to software and is currently doing it poorly.

A-A-Ron · Mar 21, 2023

LM_COGT said:
Disclaimer: this is external visibility, there's lots of things that could be true internally, but these are best guesses based on my time in hardware/software platform development.

0) Very likely a management structure that mistakenly thinks this is about "programmers." Software engineering, especially software designed for critical functions on proprietary hardware, is a large systems engineering problem and requires a multi-discipline approach to succeed. Programmers are an important piece, but just one piece of many.

1) Poor or missing Integration and Test capability. For quality releases on a routine schedule the integration and test areas are critical. How easy is it for a given software engineer to produce a fix, compile it to an engineering build, and run basic validation? Is there a continuous integration (CI) process in place ensuring that every commit to the codebase is regressed? Is that regression suite evolving and dynamic to cover the key issues learned from lab and field? Is there a high quality lab emulation of the platform that enables rapid and repeatable testability? On the culture side, is the test team treated as a key pillar in the engineering platform, or are they the annoying voices that keep complaining and dev engineers try to ignore?

2) Lack of a well-documented and efficient process for creating and releasing product lines and patching those lines. How do program leads efficiently determine the right changelist to go into a particular release? Is there sufficient visibility into individual changes to enable rapid assembly of release candidates based on clearly communicated release goals?

3) Weak or missing systems engineering group. This is a highly complex system: hardware, software, mechanical, UX. Having a team of engineers dedicated to understanding the end-to-end product and how the multiple pieces fit together enables specialists in the different areas to focus on their strengths, while being given the support to understand how they fit within the larger picture. This has been well known since the early rocketry days in many places, but I have often seen a blind spot in the org structure where software is not included in the systems engineering process, which is a big mistake. Software is a critical glue in modern systems and needs the support of dedicated systems engineering resources to work well with other teams.

As a systems engineer with the last decade+ spent in Integration and Test, I 100% agree with this assessment. I don't have any internal visibility into Ford, but everything I see from the outside suggests weak or missing systems engineering to tie everything together, plus inadequate testing. I think the DevOps (your area) for a connected car like this was way more complex than Ford anticipated. Pulling together the myriad baselines, different hardware configurations, multiple country configs, languages, chip shortages and everything else has just gotten away from them, and it's really, really hard to pull that back together (evidence: I think they're trying to get baselines in sync with the 4.2.x, same as the 4.1.x releases they already deployed to pre-23s). It definitely ties back to management being unable to handle a project this size, and I think the slow updates and sometimes overly cautious functionality (Hands Free Blue Cruise) show a very risk-averse management as well.

TL;DR: There's a lot more software in this car than in previous Fords, and I think Ford was completely unprepared for the size, scope, and complexity of this. They don't know how to run a large, agile software development effort.

RobbertPatrison · Mar 21, 2023

I have been writing software systems for decades. With millions of lines of code running behind the scenes, I think it is a miracle that there is software with so few bugs.

The complexity is mind-boggling. When a few packages are glued together via APIs it is very hard to cover all the corner cases.

Compared to my previous EVs, the MME does seem better with fewer random crashes and more reliable CarPlay. My main complaint is very unreliable PAAK, which is likely a Bluetooth hardware/firmware issue.

Compared to the GM app, the FordPass app is clumsy but it does work quite reliably. When PAAK fails, clicking unlock on the Apple Watch app is the fastest way to open the car.

DadzBoyz · Mar 21, 2023

LM_COGT said:
Disclaimer: this is external visibility, there's lots of things that could be true internally, but these are best guesses based on my time in hardware/software platform development.

0) Very likely a management structure that mistakenly thinks this is about "programmers." Software engineering, especially software designed for critical functions on proprietary hardware, is a large systems engineering problem and requires a multi-discipline approach to succeed. Programmers are an important piece, but just one piece of many.

1) Poor or missing Integration and Test capability. For quality releases on a routine schedule the integration and test areas are critical. How easy is it for a given software engineer to produce a fix, compile it to an engineering build, and run basic validation? Is there a continuous integration (CI) process in place ensuring that every commit to the codebase is regressed? Is that regression suite evolving and dynamic to cover the key issues learned from lab and field? Is there a high quality lab emulation of the platform that enables rapid and repeatable testability? On the culture side, is the test team treated as a key pillar in the engineering platform, or are they the annoying voices that keep complaining and dev engineers try to ignore?

2) Lack of a well-documented and efficient process for creating and releasing product lines and patching those lines. How do program leads efficiently determine the right changelist to go into a particular release? Is there sufficient visibility into individual changes to enable rapid assembly of release candidates based on clearly communicated release goals?

3) Weak or missing systems engineering group. This is a highly complex system: hardware, software, mechanical, UX. Having a team of engineers dedicated to understanding the end-to-end product and how the multiple pieces fit together enables specialists in the different areas to focus on their strengths, while being given the support to understand how they fit within the larger picture. This has been well known since the early rocketry days in many places, but I have often seen a blind spot in the org structure where software is not included in the systems engineering process, which is a big mistake. Software is a critical glue in modern systems and needs the support of dedicated systems engineering resources to work well with other teams.

I concur with most of this.

I would add a couple of things.

Sync is built on a third-party software platform. Think of it as the difference between Apple iOS and Google Android phones.

Apple builds the software and hardware together. In recent years, they've even gone as far as designing their own processors. Apple's iOS software is written and compiled (translated into machine language) for their specific requirements and a defined set of hardware. It is all designed to work together, hardware and software.

Google designed Android to be hardware agnostic. Anyone can use it. Samsung takes the Android platform and incorporates their own visual design, fonts, apps, etc. OnePlus does the same. LG used to do the same. Motorola does the same. Lenovo and Sony do the same. Interestingly, with the Google Pixel phones, Google seems to want to copy Apple to some degree. Where things get rough around the edges, from the software standpoint, is in the other manufacturers' customizations. Complicating things even more, Samsung, Motorola, Sony, Lenovo, etc. all use different hardware: processors, modems, cameras, screens, etc. Even worse, they may use different hardware for phones at different price points. Each new piece of hardware is another variable that has to be accounted for in the programming and in compiling that programming into machine language (the ultimate release of the software that you install and use). Now you have:
- Different customizations of the base platform (Android)
- Different hardware
- Different firmware (like a driver) for that hardware
- etc.
All of these things, together, are what make Android unique, but never quite as smooth and refined as Apple iOS. It also means that it takes longer for Samsung, Sony, etc. to release Android updates because they have to update their Android customizations and compile the new platform for various hardware configurations.

This is one reason why Tesla is often compared to Apple. They have brought a lot of their hardware and all of their software development in house.

Other manufacturers use other systems as their software platform and none (or few) are building their own processors and other software related hardware.

Ford Sync started off being built on a Microsoft platform.
A few years later, Ford moved Sync from Microsoft to the Blackberry QNX platform, which it is still built on.
Ford has indicated that later this year they will move to Google Automotive as the base platform.
This also means that Ford has one group maintaining and updating Sync (QNX) and another group building out the Google Automotive platform.
(No word on whether Ford can or will update Sync QNX cars to Google Automotive.)

Regardless of the base platform, car manufacturers will continue to struggle with software development and releases due to the nature of their manufacturing model. Ford, GM, Stellantis, etc. design the cars, but the parts are sourced from other manufacturers like Bosch, Borg-Warner, Nippon, Denso, etc. They source processors and other computer modules from other manufacturers as well. For example, the HVBJB defect was not a Ford defect. It was a defect from another manufacturer that Ford contracted to make that part.
Another example, a car manufacturer may source a Body Control Module (BCM) from multiple manufacturers. If that part is sourced from 3 possible companies, the BCM firmware must be written, compiled, and tested three times... once for each possible part used. This helped the build process during the chip shortages, but complicated things for the software side of the house. All said, each of these variables is an opportunity for failure or error.

As long as car manufacturers source the software and firmware related parts from multiple manufacturers, and they do not enforce strict design and firmware compatibility requirements, they will continue to have to do multiple times the work, requiring more people, and more time to push new and updated "Power-Ups" out.

On a Side Note
Last year, Apple shared a new version of CarPlay that is a complete user interface for all car system interaction. This expands CarPlay from the center screen to all screens, including speedometer, odometer, trip computer, fuel/charge status, etc.
Google and Apple seem to be more and more interested in taking over the car User Interface, but the underlying hardware issue remains, unless Apple or Google institute strict hardware requirements to use their solutions.
https://jalopnik.com/apples-next-carplay-update-will-run-your-whole-car-1849024124

superdave80 · Mar 21, 2023

Mirak said:
But how is Ford breaking stuff that previously worked?

I work in automation and do some basic programming. Many, many times I have introduced a fix for one problem, and then later seen that it affected another piece of the program in a way I did not expect, causing another issue. This is especially true when working on a program created by someone else.

ArthurDOB · Mar 21, 2023

Mirak said:
"I know I am probably going to regret the avalanche of nerdery that is about to ensue..."

This made me laugh!

Mirak · Mar 21, 2023

superdave80 said:
I work in automation and do some basic programming. Many, many times I have introduced a fix for one problem, and then later seen that it affected another piece of the program in a way I did not expect, causing another issue. This is especially true when working on a program created by someone else.

This is interesting to me how that happens. I view, perhaps ignorantly, a software interface as an assemblage of lots of different modules. If you make a tweak in one module, it shouldn't really impact any of the other modules, no? I guess that's an over-simplification. But I guess you could tweak a module that multiple functions depend upon, and that tweak could fix one function but break another?

Mirak · Mar 21, 2023

I totally get the explanations that certain things are higher priority. Just seems like after two years even lower priority stuff should get some attention.

I totally get the explanation that some things we view as bugs, the engineers might view as a feature. It's just tough to envision why the engineers would intend for...
- the car to always remote start to guest profile
- it to be impossible to permanently turn off the interior motion sensors, even though they gave us a toggle in settings to do just that.

I totally get the explanation that certain bugs are tough to hunt down because they aren't easy to replicate. Like the One Pedal Driving fault or the "navigation to nowhere bug" that happens rarely. But things like radio station presets disappearing happens frequently enough that you'd think they could replicate that.

Then there's the broader process critique, which I'll just have to take your word for. I don't know what the best practice is, just that Ford doesn't seem to be doing great.

Degrix · Mar 21, 2023

Mirak said:
This is interesting to me how that happens. I view, perhaps ignorantly, a software interface as an assemblage of lots of different modules. If you make a tweak in one module, it shouldn't really impact any of the other modules, no? I guess that's an over-simplification. But I guess you could tweak a module that multiple functions depend upon, and that tweak could fix one function but break another?

Modules generally aren't tightly linked to one another (which is a good thing), so tweaking one shouldn't affect the others as long as its interface does not change. However, things can still break in non-obvious ways.

For example, let's say there's a module that prints a document. Everyone uses it, but one group (A) doesn't like that it prints upside down on their printer. Now, the print module hasn't been touched in ages, so they decide to just flip whatever they want to print upside down to trick the print module into printing the document the way they want it printed, still using the same interface. Then, someone on the printer team notices the upside-down print issue and fixes it. Group (A) is now broken until they revert their workaround.

Now, the above example is ultimately the result of a bad process, but still, it happens.
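
Here is a rough Python sketch of that scenario. All the function names are made up for illustration, and "upside down" is modeled simply as reversing the order of the lines on the page. The interface of the print module never changes, yet Group A's output flips once the bug is fixed.

```python
def print_document_v1(lines):
    """Original print module (hypothetical): output comes out upside down."""
    return list(reversed(lines))


def print_document_v2(lines):
    """Fixed print module: same interface, output is now right side up."""
    return list(lines)


def group_a_print(lines, print_module):
    """Group A's workaround: pre-flip the page to cancel out the v1 bug."""
    return print_module(list(reversed(lines)))


page = ["Title", "Body", "Footer"]

# With the old module, the two flips cancel and the page looks right.
before_fix = group_a_print(page, print_document_v1)

# After the printer team's fix, the workaround itself flips the page.
after_fix = group_a_print(page, print_document_v2)
```

Nobody touched Group A's code and nobody changed the interface, but their printouts broke anyway.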

atomdeathstroke · Mar 21, 2023

**** DISCLAIMER**** I'm not a programmer or developer, or hardware or software engineer, but I work with a lot of them and I work in the manufacturing space with lots of complicated hardware and software.

There are many reasons why your software is buggy, but I'm only going to focus on the main issue: hardware fragmentation.

The hardware in every vehicle is very different. To make it even worse, the hardware in the same vehicle model can also vary greatly. Ford, like most OEMs, doesn't make most of its parts internally. Not only that, but the parts they do use can come from multiple vendors, and even a part from the same vendor could be a different version from time to time. This has gotten worse over the years due to rapid changes in the technologies, but also because of material and supply limitations. Even regulatory changes can affect parts.

Imagine being on the software development team and you are tasked with updating the sensors to better allow the customer to use their phone as a key. From our perspective, it sounds like a simple task, but the reality is much worse. This developer or team of developers has to account for the software changes for not just a single system, but multiple systems, with many differing software revisions and firmware versions. A "simple" software update from Ford might have to work with over a dozen varying hardware components that all do the same thing. So instead of a single piece of code, you now have 12 versions of that code to correlate with each hardware version of what you are trying to work with. Any one of those versions could have its own glitches and issues, which could cascade across other systems because of the high variability of the hardware in each vehicle.
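
To put rough numbers on how quickly the variants multiply, here is a small Python sketch. The supplier names and the revision and firmware counts are invented purely for illustration and are not Ford's actual figures.

```python
from itertools import product

# Made-up variant counts, purely for illustration.
suppliers = ["vendor_a", "vendor_b", "vendor_c"]
hardware_revisions = ["rev1", "rev2"]
firmware_versions = ["fw1.0", "fw1.1"]

# Every combination is a configuration the update has to be validated against.
test_matrix = list(product(suppliers, hardware_revisions, firmware_versions))
configurations_to_validate = len(test_matrix)  # 3 * 2 * 2
```

Adding just one more option to any of those lists grows the whole matrix, which is why a "simple" update stops being simple.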

This is why companies like Apple and Tesla take a much different approach to this issue. They vertically integrate as much as possible, and they also try to use a single piece of hardware across their entire product line.

Apple does this best with their iOS platform. I know with certainty that when I look at an iPhone of any model from a given year, the software is 99.9% identical across all their devices. Nearly all the hardware is identical as well (with a few exceptions because of the sheer volumes they make).

Tesla is also working hard on that front. If you look at Tesla's vehicle recalls, nearly all of them could be fixed by a simple software patch, and it's done within a day or so. They have this ability because the hardware variations are so few. It's the same sensors, same computers, same cameras, cooling motors, etc. across their entire product lines.

I'm not here to blow smoke up Apple's or Tesla's ass, but they are taking the correct approach. In fact, this system was already done 100 years ago....... BY FORD THEMSELVES.

It's also what Ford is trying to do now: get back to their roots.

If you want to get more detail on this issue, Munro & Associates have a BUNCH of videos talking about this and the benefits of limiting or controlling individual components to cut down on costs and complexity, especially on the software and QA side.

astrorob · Mar 21, 2023

AlanJ said:
Developers will sometimes just comment out code (instead of deleting it, they make it inactive by turning those lines into a comment). This adds so much bloat that it can cause issues since the computer still has to go through it and load it. Almost every app developer does this, even though it's not the right way.

i agree that excessive commenting out of code makes the code hard to read, but if they are using compiled languages the commented code is not in the binary and the car’s computer does not have to (and simply cannot) load it or execute it. the compiler ignores those lines and no machine instructions are emitted for the commented code.

interpreted languages are a different story - the interpreter sees those lines at runtime but in theory should do nothing when they are encountered.
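
a small check of this using CPython, the standard Python implementation, which compiles source to bytecode before running it. other interpreters may behave differently, so treat this as an illustration for one runtime only. the variable names and the commented-out lines are made up.

```python
# Same program twice: once with commented-out code, once without.
with_dead_code = (
    "x = 1\n"
    "# y = expensive_call()\n"
    "# z = y + 1\n"
    "result = x + 1\n"
)
without_dead_code = (
    "x = 1\n"
    "result = x + 1\n"
)

code_a = compile(with_dead_code, "<with_comments>", "exec")
code_b = compile(without_dead_code, "<without_comments>", "exec")

# Compare the emitted bytecode instructions and constants.
same_instructions = code_a.co_code == code_b.co_code
same_constants = code_a.co_consts == code_b.co_consts

# Run the commented version: the dead lines never execute.
namespace = {}
exec(code_a, namespace)
```

in this runtime the comments are dropped before any instructions are generated, so the commented-out lines cost reading effort for humans rather than work for the machine.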

superdave80 · Mar 21, 2023

Mirak said:
But I guess you could tweak a module that multiple functions depend upon, and that tweak could fix one function but break another?

As an example, I program controllers that go through steps, and each step turns on/off certain bits to keep track of where a program is in its cycle. Now those bits can ALSO have other uses in other parts of the program. If I don't carefully check what other functions a bit might have, changing how/when those bits turn on/off could affect other parts of the program.
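
A minimal Python sketch of that kind of shared-bit problem. Real controllers are usually programmed in ladder logic or similar, and the bit names and the interlock here are hypothetical, so take this only as an illustration of the idea.

```python
from enum import IntFlag


class Bits(IntFlag):
    """Hypothetical status bits shared across the whole program."""
    CLAMP_CLOSED = 1
    STEP_2_ACTIVE = 2


def safety_interlock_ok(bits):
    """A different part of the program that also reads the clamp bit."""
    return bool(bits & Bits.CLAMP_CLOSED)


def step_2_original():
    """Step 2 as first written: keeps the clamp bit on."""
    return Bits.STEP_2_ACTIVE | Bits.CLAMP_CLOSED


def step_2_tweaked():
    """A 'fix' for an unrelated issue: turns the clamp bit off earlier."""
    return Bits.STEP_2_ACTIVE


interlock_before = safety_interlock_ok(step_2_original())
interlock_after = safety_interlock_ok(step_2_tweaked())
```

The tweak only touched step 2, but the interlock that quietly depended on the same bit now behaves differently.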

astrorob · Mar 21, 2023

also with respect to fixed bugs reappearing, many times a development group will "fork" their codebase. this might be done to add a huge new feature, so that it can be done "on the side" rather than messing up the main codebase.

they might then discover a bug that was common to the trunk of the fork and one group will fix the bug and the other won't. then at some point it comes time to merge the forks back into the mainline code and due to a merge mistake the old bug is brought back into the code.

you also need to have nightly regression tests on everything that was checked into the code database the day before in order to make sure that people don't inadvertently break something. writing these tests is a discipline unto itself. the people writing the software should not also be writing the tests. this costs money in headcount.

software development is messy and you really have to be very disciplined and have an organizational flow and culture that lets you avoid these kinds of problems. there really need to be people whose job it is to oversee this stuff and manage the process independently of the developers.
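
a minimal sketch of a regression test that pins a fixed bug, using python's built-in unittest module. the parse_preset function and the bug it once had are hypothetical, just to show the shape of the thing. if a bad merge drops the fix, the nightly run fails instead of the customer finding out.

```python
import unittest


def parse_preset(text):
    """Hypothetical function that once crashed on blank input (the 'old bug')."""
    if not text.strip():
        return None  # the fix: treat blank input as "no preset"
    return float(text)


class PresetRegressionTests(unittest.TestCase):
    def test_blank_input_does_not_crash(self):
        # Pins the fixed bug: fails if a merge removes the guard above.
        self.assertIsNone(parse_preset("   "))

    def test_valid_input_still_parses(self):
        self.assertEqual(parse_preset("101.1"), 101.1)


def run_regression_suite():
    """Run the suite the way a nightly job might and report pass/fail."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PresetRegressionTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

a nightly job would call something like run_regression_suite() against every branch before it is allowed to merge.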

