A large language model (LLM) is a type of AI trained to understand and generate human-like text. It learns the structure, grammar, and semantics of a language by processing vast amounts of textual data (documents, wiki pages, social media posts, etc.). It is the LLM that will automate the production of your CE documentation.
Conventional engineering software has long aided engineers with numbers and graphics, leaving the nuanced textual information for engineers to handle manually; no automation was possible. With the advent of AI/LLM technology, that gap has been bridged.
The primary goal of an LLM is to predict the probability of the next word, or sequence of words, in a given context. This is akin to a much simpler and very familiar technology: the bar of suggested completion words that appears above the keypad when one is typing out a text message on a smartphone.
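The next-word prediction idea can be illustrated with a toy model. The sketch below is plain Python and deliberately simplistic (a bigram counter over a made-up corpus, nothing like a real LLM): it counts which word most often follows each word and "predicts" accordingly.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus: str):
    """Count, for every word, which words follow it and how often."""
    words = corpus.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def predict_next(followers, word: str) -> str:
    """Return the most frequently observed next word, or "" if unseen."""
    counts = followers.get(word.lower())
    return counts.most_common(1)[0][0] if counts else ""

corpus = "the machine is safe the machine is compliant the machine is safe"
model = train_bigrams(corpus)
print(predict_next(model, "is"))  # "safe" follows "is" twice, "compliant" once
```

A real LLM does the same kind of thing at vastly greater scale, with context windows of thousands of words rather than a single preceding word.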
Thus, at their core, LLMs are capable of one thing: completing text. The input to the model is called the prompt: a document, or block of text, that we expect the model to complete. Prompt engineering is the practice of crafting the prompt so that its completion contains the information required to address the problem at hand. The LLM user normally interacts with an LLM application or interface (e.g. Microsoft Copilot, Google Gemini) that passes the prompt to the LLM engine and presents the results of the completion back to the user.
An LLM may generate hallucinations: factually wrong but plausible-looking pieces of information produced with confidence. If the prompt references something that does not exist, the LLM will likely assume its existence and hallucinate.
A typical approach to detecting and mitigating hallucinations is to have the LLM provide background information that can be checked: an explanation of its reasoning, a calculation that can be performed independently, a source link, or keywords and details that can be searched for.
Other techniques to reduce hallucinations are outlined below.
The goal is to create a prompt so that its completion contains information that can be used to address the user’s problem. To achieve this, the following basic principles of prompt engineering apply:
The prompt must closely resemble content from the training set. That is, the more similar the prompt is to documents from the training set, the more likely it is that the completion will be predictable and stable. In other words, the user should mimic common patterns found in the training data. If the user wants to see what kinds of documents the model is familiar with, the easiest way is to ask the model (e.g., prompt: "What types of formal documents are useful for documenting a risk assessment for a backup diesel generator?"). The user should see a large selection of documents to pattern their request after. Next, the user can ask the model to generate an example document and see whether it meets their needs.
The prompt must include all the information relevant to addressing the user's problem. That is, the user must collect all of the information (context) relevant to solving the problem and incorporate it into the prompt. Note that the incorporated context must be strongly relevant (down to individual words) so as not to distract the model from generating a relevant completion.
The prompt must lead the model to generate a completion that addresses the problem. That is, the prompt must condition the model towards the type of response the user hopes to see. The prompt may include example responses for the model to capture the pattern.
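The three principles above can be sketched as a simple prompt-assembly helper. This is an illustrative outline only (the generator details and the task wording are made up for the example): relevant context first, optional example responses to condition the model, then the task itself.

```python
def build_prompt(context: list, task: str, examples=()) -> str:
    """Assemble a prompt: relevant context first, optional example
    responses to set the expected pattern, then the task itself."""
    parts = []
    if context:
        parts.append("Context:\n" + "\n".join(f"- {c}" for c in context))
    for i, ex in enumerate(examples, 1):
        parts.append(f"Example response {i}:\n{ex}")
    parts.append("Task:\n" + task)
    return "\n\n".join(parts)

prompt = build_prompt(
    context=["Backup diesel generator, 400 kVA", "Installed indoors"],
    task="Draft a risk assessment outline for this generator.",
)
print(prompt)
```

The point of keeping assembly in one function is discipline: every piece of context that goes into the prompt is listed explicitly, which makes it easier to prune weakly relevant material.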
LLM prompting techniques fall into a few families that help you reduce hallucinations, improve reasoning quality, and increase reproducibility. Each technique shapes how the model thinks, structures information, or checks its own output.
Few‑shot prompting improves accuracy by showing the model 1–3 examples of the pattern you want it to follow.
It works because LLMs are pattern learners: they imitate the structure, tone, and logic of the examples.
What it’s best for
Structured outputs (risk tables, checklists, DoC templates)
Classification tasks
Style imitation
Reducing hallucinations by anchoring the model to real examples
Example 1:
Input: “Electric drill” → Output: “LVD + EMC apply; hazards: electric shock, overheating, mechanical injury.”
Example 2:
Input: “Toy robot” → Output: “Toy Safety Directive; hazards: choking, chemicals, flammability.”
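The two examples above can be formatted into a reusable few-shot prompt. The sketch below shows one common layout (input/output pairs followed by the new input left open for the model to complete); the query product is a made-up example.

```python
def few_shot_prompt(examples: list, query: str) -> str:
    """Format input/output example pairs, then the new input with an
    open "Output:" for the model to continue the pattern."""
    lines = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

examples = [
    ("Electric drill",
     "LVD + EMC apply; hazards: electric shock, overheating, mechanical injury."),
    ("Toy robot",
     "Toy Safety Directive; hazards: choking, chemicals, flammability."),
]
print(few_shot_prompt(examples, "Cordless lawnmower"))
```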
Chain‑of‑thought prompting instructs the model to show its reasoning steps before giving the final answer.
This improves correctness on multi‑step tasks and reduces shallow or shortcut answers.
What it’s best for
Risk assessments
Mapping essential requirements to hazards
Conformity‑assessment logic
Standards selection
Example
“Explain step‑by‑step which EU directives apply to this machine.
Step 1: extract key characteristics.
Step 2: map characteristics to directive scopes.
Step 3: justify applicability.
Step 4: give the final list.”
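The four steps above can be turned into a reusable template so the same reasoning scaffold is applied consistently across prompts. This is a sketch; the step wording follows the example above and can be swapped for any multi-step task.

```python
COT_STEPS = [
    "Extract key characteristics of the machine.",
    "Map characteristics to directive scopes.",
    "Justify the applicability of each directive.",
    "Give the final list of applicable directives.",
]

def chain_of_thought_prompt(question: str, steps=COT_STEPS) -> str:
    """Prefix the question with explicit, numbered reasoning steps."""
    numbered = "\n".join(f"Step {i}: {s}" for i, s in enumerate(steps, 1))
    return f"{question}\nExplain step-by-step:\n{numbered}"

print(chain_of_thought_prompt("Which EU directives apply to this machine?"))
```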
Atom‑of‑thought prompting breaks reasoning into very small, verifiable micro‑steps.
It is more granular than chain‑of‑thought and reduces hallucinations by forcing the model to reason slowly and explicitly.
What it’s best for
High‑risk machinery classification
Hazard identification
Standards mapping
Legal interpretation
Example
“Analyse this machinery using atom‑of‑thought reasoning:
Identify each functional subsystem.
Identify hazards per subsystem.
Map each hazard to Annex III clauses.
Map each clause to evidence needed.”
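One way to operationalise the atoms above is to issue each micro-step as its own single-question prompt, so every micro-answer can be verified independently before the next atom is asked. A sketch (the machinery name is a made-up example):

```python
def atom_prompts(machine: str) -> list:
    """Turn the atom-of-thought steps into separate single-question
    prompts, one per micro-step, so each answer can be checked alone."""
    atoms = [
        "Identify each functional subsystem.",
        "Identify hazards per subsystem.",
        "Map each hazard to Annex III clauses.",
        "Map each clause to evidence needed.",
    ]
    return [f"Machinery: {machine}\nAnswer only this question: {a}"
            for a in atoms]

for p in atom_prompts("CNC milling machine"):
    print(p, end="\n\n")
```

In practice, the verified answer from each atom would be appended to the context of the next one.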
Self‑checking prompts ask the model to audit its own output, reducing omissions and hallucinations.
Common patterns
Completeness check
“List any essential requirements that might apply but were not included.”
Consistency check
“Check whether the DoC, risk assessment, and standards list contradict each other.”
Error‑finding
“Identify any statements that may be incorrect or unsupported.”
Example
“Provide the answer, then perform a self‑check listing any missing hazard categories or any assumptions that require verification.”
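The self-check patterns above can be appended mechanically to any task prompt. The sketch below (the conveyor-belt task is a made-up example) composes a task with an explicit audit section so the model reviews its own answer before finishing.

```python
def with_self_check(task_prompt: str, checks: list) -> str:
    """Append an explicit self-audit section to the prompt so the
    model reviews its own output against each check."""
    audit = "\n".join(f"- {c}" for c in checks)
    return (f"{task_prompt}\n\n"
            "After answering, perform a self-check:\n" + audit)

prompt = with_self_check(
    "List the hazards of this conveyor belt.",
    ["List any hazard categories that might apply but were not included.",
     "Flag any assumptions that require verification."],
)
print(prompt)
```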
1. Instruction‑scoped prompting
Constrain the model tightly:
“Use only information from Regulation 2023/1230 Annex III.”
“Do not infer requirements not explicitly stated.”
This reduces hallucinations by limiting the model’s freedom.
2. Schema‑based prompting
Provide a fixed structure the model must fill in:
Hazard → Risk → Mitigation → Evidence
Requirement → Standard → Test → Documentation
Schemas reduce variability and force completeness.
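A schema can be enforced from both sides: the prompt asks the model to fill every field, and a simple check confirms that the reply actually did. A minimal sketch using the Hazard → Risk → Mitigation → Evidence schema (the sample reply is made up):

```python
SCHEMA = ("Hazard", "Risk", "Mitigation", "Evidence")

def schema_prompt(subject: str) -> str:
    """Ask the model to fill every field of the fixed schema."""
    fields = "\n".join(f"{f}: <fill in>" for f in SCHEMA)
    return f"For {subject}, complete every field:\n{fields}"

def is_complete(response: str) -> bool:
    """Check that the reply contains every schema field label."""
    return all(f"{f}:" in response for f in SCHEMA)

reply = "Hazard: entanglement\nRisk: high\nMitigation: guard\nEvidence: test report"
print(is_complete(reply))  # True
```

The completeness check is crude (it only looks for field labels), but even this catches the most common failure mode: the model silently dropping a field.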
3. Cite‑your‑source prompting
Ask the model to identify where each claim comes from:
“For each requirement, state whether it comes from Annex III, a harmonised standard, or general engineering practice.”
This reduces invented requirements.
4. Counterfactual prompting
Ask the model to challenge itself:
“List reasons why your classification might be wrong.”
This exposes weak reasoning.
5. Role‑based prompting
Assign the model a specific expert role:
“Act as an EU machinery‑safety assessor.”
This improves domain consistency.
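Role-based prompting combines naturally with instruction-scoped constraints when using a chat-style interface: the system message fixes the role and the hard limits, the user message carries the question. A sketch (the press-brake question is a made-up example, and the message format follows the common system/user chat convention rather than any specific vendor's API):

```python
def role_prompt(role: str, constraints: list, question: str) -> list:
    """Build a chat-style message list: the system message sets the
    expert role and hard constraints, the user message asks the question."""
    system = f"Act as {role}.\n" + "\n".join(f"- {c}" for c in constraints)
    return [{"role": "system", "content": system},
            {"role": "user", "content": question}]

msgs = role_prompt(
    "an EU machinery-safety assessor",
    ["Use only information from Regulation 2023/1230 Annex III.",
     "Do not infer requirements not explicitly stated."],
    "Does this press brake fall under Annex III?",
)
print(msgs[0]["content"])
```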