Compiling an application for use in highly radioactive environments

We are compiling an embedded C/C++ application that is deployed in a shielded device in an environment bombarded with ionizing radiation. We are using GCC and cross-compiling for ARM. When deployed, our application generates some erroneous data and crashes more often than we would like. The hardware is designed for this environment, and our application has run on this platform for several years.

Are there changes we can make to our code, or compile-time improvements that can be made to identify/correct soft errors and memory-corruption caused by single event upsets? Have any other developers had success in reducing the harmful effects of soft errors on a long-running application?

Reposted from: https://stackoverflow.com/questions/36827659/compiling-an-application-for-use-in-highly-radioactive-environments


23 answers

lrony* 2016-04-25 02:58
Having worked for about 4-5 years on software/firmware development and environmental testing of miniaturized satellites*, I would like to share my experience here.

*(Miniaturized satellites are a lot more prone to single event upsets than bigger satellites because of the relatively small, limited size of their electronic components.)

To be very concise and direct: there is no mechanism by which the software/firmware can recover from a detectable, erroneous situation by itself without at least one copy of a minimum working version of the software/firmware somewhere for recovery purposes - and without the hardware that supports the recovery being functional.

Now, this situation is normally handled at both the hardware and the software level. Here, as you requested, I will share what we can do at the software level.

...recovery purpose... Provide the ability to update/recompile/reflash your software/firmware in the real environment. This is an almost must-have feature for any software/firmware in a highly ionized environment. Without it, you could have as much redundant software/hardware as you want, but at some point they are all going to blow up. So, prepare this feature!

...minimum working version... Have multiple, responsive copies of a minimum version of the software/firmware in your code. This is like Safe Mode in Windows. Instead of having only one, fully functional version of your software, have multiple copies of the minimum version of your software/firmware. The minimum copy will usually be much smaller than the full copy and almost always has only the following two or three features:

capable of listening to commands from an external system,

capable of updating the current software/firmware,

capable of monitoring the housekeeping data of basic operation.

...copy... somewhere... Have redundant software/firmware somewhere.

You could, with or without redundant hardware, try to have redundant software/firmware in your ARM uC. This is normally done by having two or more identical copies of the software/firmware at separate addresses, which send heartbeats to each other - but only one is active at a time. If one or more copies are known to be unresponsive, switch to another copy. The benefit of this approach is that we have a functional replacement immediately after an error occurs - without any contact with whatever external system/party is responsible for detecting and repairing the error (in the satellite case, this is usually the Mission Control Centre (MCC)).
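A minimal sketch of the heartbeat-and-switch idea above, in C. Everything here is an assumption for illustration: the names `supervisor_t`, `supervisor_beat` and `supervisor_poll`, the two-image setup, and the timeout value are hypothetical, and a real device would act on the result by jumping to the other image or resetting into it.

```c
#include <stdbool.h>
#include <stdint.h>

#define IMAGE_COUNT 2u                 /* assumed: two redundant images */
#define HEARTBEAT_TIMEOUT_TICKS 100u   /* assumed value; tune per system */

typedef struct {
    uint32_t last_beat[IMAGE_COUNT];   /* tick of the last heartbeat per image */
    uint8_t  active;                   /* index of the image currently in control */
} supervisor_t;

/* Record a heartbeat from one image. Out-of-range indices are ignored. */
void supervisor_beat(supervisor_t *s, uint8_t image, uint32_t now)
{
    if (image < IMAGE_COUNT) {
        s->last_beat[image] = now;
    }
}

/* True if the image sent a heartbeat recently enough.
 * Unsigned subtraction keeps the comparison valid across tick wrap-around. */
bool supervisor_alive(const supervisor_t *s, uint8_t image, uint32_t now)
{
    return (uint32_t)(now - s->last_beat[image]) <= HEARTBEAT_TIMEOUT_TICKS;
}

/* Call periodically. If the active image went silent, hand control to the
 * next image that is still alive. Returns the index of the active image. */
uint8_t supervisor_poll(supervisor_t *s, uint32_t now)
{
    if (!supervisor_alive(s, s->active, now)) {
        for (uint8_t i = 1u; i < IMAGE_COUNT; i++) {
            uint8_t candidate = (uint8_t)((s->active + i) % IMAGE_COUNT);
            if (supervisor_alive(s, candidate, now)) {
                s->active = candidate;
                break;
            }
        }
    }
    return s->active;
}
```

If no image is alive, the sketch leaves the active index unchanged; that is the point where a hardware watchdog or an external system would have to step in.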

Strictly speaking, without redundant hardware, the disadvantage of doing this is that you cannot actually eliminate all single points of failure. At the very least, you will still have one single point of failure, which is the switch itself (or often the beginning of the code). Nevertheless, for a device limited by size in a highly ionized environment (such as pico/femto satellites), reducing the single points of failure to one without additional hardware is still worth considering. Moreover, the piece of code for the switching is certainly much smaller than the code for the whole program - significantly reducing the risk of a single event occurring in it.

But if you are not doing this, you should have at least one copy in an external system that can contact the device and update the software/firmware (in the satellite case, this is again the mission control centre).

You could also keep the copy in permanent memory storage on your device, from which it can be triggered to restore the running system's software/firmware.
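As a rough sketch of restoring from a stored copy, the C code below checks the running image against a known CRC-32 and, if it is corrupt, copies a stored "golden" image over it. The function names and the idea of keeping the expected CRC alongside the image are assumptions for illustration. Plain RAM buffers stand in for flash here; on a real microcontroller the `memcpy` would be replaced by the vendor's flash erase/program routines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, the common
 * zlib/Ethernet variant). Slow but tiny, which suits a small checker. */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return ~crc;
}

/* Returns  0 if the running image is intact,
 *          1 if it was corrupt and has been restored from the golden copy,
 *         -1 if the golden copy is corrupt too (external help is needed). */
int restore_if_corrupt(uint8_t *running, const uint8_t *golden,
                       size_t len, uint32_t expected_crc)
{
    if (crc32_compute(running, len) == expected_crc) {
        return 0;
    }
    if (crc32_compute(golden, len) != expected_crc) {
        return -1;
    }
    memcpy(running, golden, len);  /* stand-in for a flash write */
    /* Verify the copy before reporting success. */
    return (crc32_compute(running, len) == expected_crc) ? 1 : -1;
}
```

Checking the golden copy before using it matters: a stored image can be hit by an upset just like the running one.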

...detectable erroneous situation... The error must be detectable, usually by a hardware error correction/detection circuit or by a small piece of code for error correction/detection. It is best to keep such code small, duplicated, and independent of the main software/firmware. Its only task is checking/correcting. If the hardware circuit/firmware is reliable (for example, it is more radiation hardened than the rest, or it has multiple circuits/logic paths), then you might consider doing error correction with it. If it is not, it is better to use it for error detection only and leave the correction to an external system/device. For error correction, you could consider a basic error correction algorithm such as Hamming/Golay23, because these are easier to implement in both circuits and software. But it ultimately depends on your team's capability. For error detection, CRC is normally used.
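To make the Hamming suggestion concrete, here is a small sketch of a Hamming(7,4) code in C: 4 data bits are stored with 3 parity bits, and any single flipped bit in the 7-bit codeword can be located and corrected. The function names and the bit layout (parity at positions 1, 2 and 4, counting from 1) are my own choices for illustration. Two flipped bits in the same codeword would be miscorrected, so this is only a starting point.

```c
#include <stdint.h>

/* Encode the low 4 bits of `nibble` into a 7-bit codeword.
 * Positions 1..7 hold p1 p2 d1 p3 d2 d3 d4; position n is bit n-1. */
uint8_t hamming74_encode(uint8_t nibble)
{
    uint8_t d1 = (nibble >> 0) & 1u;
    uint8_t d2 = (nibble >> 1) & 1u;
    uint8_t d3 = (nibble >> 2) & 1u;
    uint8_t d4 = (nibble >> 3) & 1u;
    uint8_t p1 = (uint8_t)(d1 ^ d2 ^ d4);  /* covers positions 1,3,5,7 */
    uint8_t p2 = (uint8_t)(d1 ^ d3 ^ d4);  /* covers positions 2,3,6,7 */
    uint8_t p3 = (uint8_t)(d2 ^ d3 ^ d4);  /* covers positions 4,5,6,7 */
    return (uint8_t)(p1 | (p2 << 1) | (d1 << 2) | (p3 << 3) |
                     (d2 << 4) | (d3 << 5) | (d4 << 6));
}

/* Decode a 7-bit codeword, correcting at most one flipped bit.
 * If `corrected` is not NULL, it is set to 1 when a bit was fixed, else 0. */
uint8_t hamming74_decode(uint8_t code, int *corrected)
{
    uint8_t b[8] = {0};                    /* b[1]..b[7]; b[0] is unused */
    for (int pos = 1; pos <= 7; pos++) {
        b[pos] = (code >> (pos - 1)) & 1u;
    }
    uint8_t s1 = (uint8_t)(b[1] ^ b[3] ^ b[5] ^ b[7]);
    uint8_t s2 = (uint8_t)(b[2] ^ b[3] ^ b[6] ^ b[7]);
    uint8_t s3 = (uint8_t)(b[4] ^ b[5] ^ b[6] ^ b[7]);
    /* The syndrome is the 1-based position of the flipped bit, or 0. */
    uint8_t syndrome = (uint8_t)(s1 | (s2 << 1) | (s3 << 2));
    if (syndrome != 0u) {
        code ^= (uint8_t)(1u << (syndrome - 1u));
    }
    if (corrected != 0) {
        *corrected = (syndrome != 0u) ? 1 : 0;
    }
    return (uint8_t)(((code >> 2) & 1u) | (((code >> 4) & 1u) << 1) |
                     (((code >> 5) & 1u) << 2) | (((code >> 6) & 1u) << 3));
}
```

A critical byte would be stored as two codewords, one per nibble, and decoded on every read so that a single upset is repaired before the value is used.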

...hardware supporting the recovery... Now we come to the most difficult aspect of this issue. Ultimately, the recovery requires the hardware responsible for the recovery to be at least functional. If the hardware is permanently broken (which normally happens after its total ionizing dose reaches a certain level), then there is (sadly) no way for the software to help with recovery. Thus, hardware is rightly the most important concern for a device exposed to high radiation levels (such as a satellite).

In addition to the suggestions above for anticipating firmware errors caused by single event upsets, I would also suggest that you have:

An error detection and/or error correction algorithm in the inter-subsystem communication protocol. This is another almost must-have, in order to avoid acting on incomplete/wrong signals received from other systems.

A filter on your ADC readings. Do not use an ADC reading directly. Filter it with a median filter, a mean filter, or any other filter - never trust a single reading. Sample more, not less - within reason.
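The two suggestions above could look roughly like this in C: a CRC-16 check on a frame received from another subsystem, and a median filter over a handful of ADC samples. The frame layout (`[length][payload...][crc_hi][crc_lo]`), the function names and the sample limit are all assumptions made up for this sketch, and the CRC shown is the CRC-16/CCITT-FALSE variant; any CRC that both ends agree on would do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Accept a frame only if its length byte and trailing CRC both agree
 * with what was actually received. Hypothetical layout:
 * [length][payload...][crc_hi][crc_lo] */
bool frame_check(const uint8_t *frame, size_t frame_len)
{
    if (frame_len < 3u || (size_t)frame[0] + 3u != frame_len) {
        return false;
    }
    uint16_t received = (uint16_t)(((uint16_t)frame[frame_len - 2u] << 8) |
                                   frame[frame_len - 1u]);
    return crc16_ccitt(frame, frame_len - 2u) == received;
}

#define MEDIAN_MAX_SAMPLES 9u  /* assumed upper bound on the window size */

/* Median of up to MEDIAN_MAX_SAMPLES readings (extra samples are ignored).
 * An insertion sort into a local buffer is enough for such small windows.
 * For an even count this returns the upper of the two middle values. */
uint16_t adc_median(const uint16_t *samples, size_t count)
{
    uint16_t sorted[MEDIAN_MAX_SAMPLES];
    if (count == 0u) {
        return 0u;
    }
    if (count > MEDIAN_MAX_SAMPLES) {
        count = MEDIAN_MAX_SAMPLES;
    }
    for (size_t i = 0; i < count; i++) {
        uint16_t value = samples[i];
        size_t j = i;
        while (j > 0u && sorted[j - 1u] > value) {
            sorted[j] = sorted[j - 1u];
            j--;
        }
        sorted[j] = value;
    }
    return sorted[count / 2u];
}
```

With five readings of 512, 510, 4095, 511 and 513, the median is 512, so the single wild value of 4095 never reaches the rest of the program. A mean filter over the same readings would have been pulled far off by it.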
This answer was selected by the asker as the best answer.
