Orchids
Real-time event analysis and temporal correlation for
intrusion detection in information systems.

The main goal of this research project is to design and develop a prototype of a new online intrusion detection system (IDS) capable of analyzing and correlating events over time, in real time. The main focus is to provide new correlation methods based on efficient model checking of a temporal logic. The platform addresses several major problems: event correlation, temporal queries, query optimization, clock desynchronization and accuracy, event flow multiplexing, multi-event handling, analyzer protection, and generalization of the intrusion detection language.

The Orchids project started in December 2002, under the direction of Jean GOUBAULT-LARRECQ, in the framework of the RNTL project DICO (Réseau National des Technologies Logicielles - Détection d'Intrusions Coopérative), which started in December 2001, and of the ACI Crypto PSI-Robuste project. Orchids specializes in intrusion detection by scenario recognition. It is developed by Julien OLIVAIN, a member of the SECSI team at the "Laboratoire Spécification et Vérification" (LSV, CNRS UMR 8643). SECSI is a research project on the security of information systems (the name stands for the French "SÉCurité des Systèmes d'Information"). It is a joint project of the INRIA Futurs research unit and of the LSV at the "Ecole Normale Supérieure (ENS) de Cachan".

Key words: Intrusion detection systems (IDS), computer networks, information systems, security, attack scenario recognition.


Information systems grow bigger and more powerful every day. This is one of the reasons why they are becoming more and more vulnerable. Threats, stakes and risks are growing as well, so the security of such information systems, and intrusion detection in particular, is now very important.

One of the greatest problems in intrusion detection is the correlation of events over time. Here, the word correlation has to be understood in the sense of 'sequence recognition': it consists of recognizing the positions of events in time and the relations between them. Many efficient tools exist, but they are often limited by their analysis method. Moreover, no existing logic or specification language is really suitable for real-time intrusion detection, since it requires very specific features.

That is why this project proposes the research and design of a platform for unifying intrusion detection systems. A logic for a specification language with several abstraction levels will be established. The core of the analysis engine is based on model-checking methods (model checking is a formal method that verifies properties such as termination, the presence of deadlocks, or the set of reachable states, given a model and a specification of a reactive system). Detection will preferably be based on misuse detection, for obvious efficiency reasons, but this does not exclude the opposite approach, anomaly detection.


The Orchids detection system recognizes scenarios by simulating known finite automata, from a given event flow. This method allows the writing of powerful stateful rules suitable for intrusion detection.

Figure 1 shows the global architecture of the Orchids platform. It is composed of five main parts: a set of rule definitions (in a dedicated specification language), a rule compiler which translates rule definitions into an internal automata representation, a set of compiled rules which is the knowledge base of the whole system, a massively parallel virtual machine which simulates non-deterministic finite automata, and a set of input modules which decode data coming from external sources.

Orchids architecture
Figure 1: Orchids global architecture

There are two kinds of sources: real-time and polled inputs. Real-time sources notify their events by themselves; for the other sources, event data needs to be checked and retrieved periodically. This modularity makes the platform easy to extend. Currently, Orchids includes 12 modules (see figure 2). An input module adds new data fields which can be referenced in rule declarations. The generic module adds a set of modules handling the text messages of common programs (see figure 3).
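
The difference between the two kinds of sources can be sketched in Python. This is only an illustration, not the actual Orchids module interface: the class names `RealTimeSource` and `PolledSource`, the methods `push` and `poll`, and the field names such as `udp.raw` are assumptions made for the example.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]  # an event is modeled as a flat field -> value mapping


class RealTimeSource:
    """A source that notifies the engine itself whenever data arrives."""

    def __init__(self, name: str, on_event: Callable[[Event], None]) -> None:
        self.name = name
        self.on_event = on_event

    def push(self, raw: str) -> None:
        # Called by the transport (e.g. a socket callback) on arrival.
        self.on_event({f"{self.name}.raw": raw})


class PolledSource:
    """A source the engine must check periodically for new data."""

    def __init__(self, name: str, lines: List[str]) -> None:
        self.name = name
        self._lines = lines   # stands in for a growing log file
        self._offset = 0      # number of lines already consumed

    def poll(self) -> List[Event]:
        new = self._lines[self._offset:]
        self._offset = len(self._lines)
        return [{f"{self.name}.line": line} for line in new]


received: List[Event] = []
udp = RealTimeSource("udp", received.append)
log_lines = ["boot ok"]
textfile = PolledSource("textfile", log_lines)

udp.push("hello")                 # delivered immediately
received.extend(textfile.poll())  # delivered at the next polling tick
log_lines.append("login jsmith")
received.extend(textfile.poll())  # only the new line is returned
```

In both cases the module contributes named fields to the event, which is what allows rule declarations to refer to them.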

Module description
Provides remote administration and debugging.
Reads plain text log files, line by line.
Receives messages in real time via the UDP network protocol.
Reads data in the BSD Syslog format.
Receives binary messages from a modified version of Linux Snare.
Reads Linux Snare text logs.
Reads data from the Linux firewall log, NetFilter.
Provides modules programmable by regular expressions for text sources.
Receives network filtering information from Cisco equipment.
Queries SNMP devices and receives traps.
Reads information from Sun Basic Security Module audit files.
Reads events from the Microsoft Windows Event Logging system.
Figure 2: Orchids input modules.

Software name
AppleTalk Filing Protocol daemon - Netatalk
Anacron - anac(h)ronistic cron
Apache HTTP Server
Strong cryptography for the Apache HTTP Server
Keeps track of Ethernet/IP address pairings
Automatic mounter
Daemon to execute scheduled commands
Common UNIX Printing System
dhcpd / dhclient: Dynamic Host Configuration Protocol server/client
Secure IMAP and POP3 server.
An entropy checker for ciphered network connections
IMAP server
linux kernel: Messages of the Linux kernel and its drivers
A Domain Name System server
Network time protocol daemon
pam linux: Pluggable Authentication Modules for Linux
A tiny POP3 daemon
The Postfix mail transfer agent
SQL server
Network file system utilities
Remote shell server
Fast incremental file transfer utility
The Sendmail mail transfer agent
The Snort intrusion detection system
Secure shell client and server
Execute a command as another user
Trivial file transfer protocol server
Linux configurable dynamic device naming support
The extended Internet services daemon
Yellow Pages (NIS) tools
Yellow Pages (NIS) server
Figure 3: Software supported by the generic module.


Multi-event temporal correlation: this is the main feature of Orchids. It can search for complex sequences of events subject to given constraints on time, data or other sequences.
Report accuracy: Orchids can catch events that do not affect detection. These informational events are included in the report to help security administrators in their work, by answering questions such as 'What has the attacker done, and how?'. This is the case of the GET_REGS and ANY-SYSCALL events in the ptrace example.
Report simplification: Sometimes full accuracy is not required: for example, in a rule searching for a long sequence of identical events, only the first and the last are relevant; the others can be hidden in the report summary. Orchids rules can tag events in reports with a relevance level.
Embedded SWI-Prolog interpreter: Orchids can use an embedded Prolog interpreter to maintain a knowledge base and to perform some logical operations. This interpreter allows Orchids users to extend the detection language by writing Prolog modules. Prolog has multiple uses in Orchids. It represents the initial static knowledge of the system, such as the network topology, the user account database, installed software, service configuration, etc. Another use is a dynamic knowledge base, interactively modified by rule activity: for example, rules update a blacklist to avoid detecting again an attacker that has already been identified. Another use of Prolog is the resolution of 'complex' data aliasing: in a network, the same data may have multiple representations. For example, Prolog can deduce the relations between the mail address, the real name 'John SMITH', the user id 42, the username jsmith, the workstation ''. Prolog can also be used to deduce some trivial conclusions: e.g. a Windows attack attempt on a Linux server will not succeed. See small example. Bigger knowledge bases should use external tables or an SQL database interface.
Automatic green cuts: Orchids represents detection rules as non-deterministic, epsilon-acyclic finite automata. A green cut is an optimization that prunes paths still to be explored when the paths already found are optimal (in the 'shortest run' sense of this paper). The name green cut comes from Prolog and means that the cut has no side effect: the result is the same as if the cut had never happened.
Automata with cuts: Orchids automata can also include explicit red cuts; this name also comes from Prolog (equivalent to the '!' Prolog operator). They have two uses: the first is rule optimization, and the second is introducing a form of negation. These cuts have side effects and modify the behaviour of the detection process. This operator must be used with extreme caution.
Automata with real-time active timeouts: A real-time active timeout is a special automaton transition that moves to another state at a specific time. The real-time scheduler wakes up the analyzer engine even if no event has been received. This allows actions to be executed at a specific time according to rule definitions. An example of this construction is a rule that checks that periodic jobs have been done, and raises an alarm if no job has been seen. This makes it possible to detect the absence of an event during a time period and to trigger actions as soon as possible. This is another form of negation.
Synchronization variables: Synchronization variable environments prevent the detection of multiple parallel instances of the same rule that share the same values of the synchronization variables. This limits the number of rule instances created. For example, a rule synchronized on a variable $attacker will detect only the first instance of attack attempts from the same attacker. Combined with a Prolog blacklist of attackers, this is a powerful tool against denial-of-service (DoS) attacks on Orchids itself.
Internal state viewer module: Orchids can output its internal state as an HTML page. This feature is mainly intended for debugging. See a small example.
Modularity: Orchids is fully extensible thanks to its modularity. Modules can add new features such as new inputs and new language functions. There are no particular limits other than CPU time and memory space, of course.
Extensible action language in rule specification: Each detection state can execute actions. This may be used in detection rules to issue a preliminary report (for example, sending a mail alarm while the rule continues to track the attacker's operations). It can also be used to interact with the environment: blocking the attacker by dynamically adding a rule to a firewall, disabling a suspicious resource such as a user account, adding a dynamic rule to the Prolog knowledge base, writing a report to a file, etc.
Temporal analysis module: Orchids includes a temporal analysis module. By adding a new function to the specification language, this module computes event category frequencies; more formally, a periodogram that represents the statistical distribution of time gaps. This module also computes phases of events with respect to some common predefined calendars such as seconds, hours, days, weeks, months and years. More calendars can also be defined. The module can also compute a Kullback-Leibler distance between the current time window and the global frequency table; this gives a statistical index of how normal the current activity of an event category is, according to the learnt frequencies.
Real-time scheduler: The real-time analyzer sleeps while it waits for an event. The real-time scheduler can wake up the analyzer to execute actions at defined dates. The actions to execute are registered by rules and modules, so they can cover a wide range, such as polling data, generating meta-events, flushing caches, firing automata timeouts, computing frequencies, or regenerating the internal state view.
User-definable rule preprocessor: The Orchids rule compiler uses a preprocessor defined in the configuration. The default preprocessor is cpp. Other preprocessors may be used, such as m4, which can sometimes be helpful for basic rewriting operations and for introducing a higher-level language.
Clock imprecision: Monitoring a massively parallel system in order to recognize event sequences is a hard problem: events come from different places, processing and routing times may vary from one piece of equipment to another, time representations may differ and be more or less accurate, and the synchronization between all pieces of equipment may be more or less precise, depending directly on the synchronization method. The clock imprecision module maintains a knowledge base of the different time references present in the system. It keeps precision and synchronization constraints for each known clock, and can inform the compiler and the correlation engine whether a strict or a partial event ordering will occur. In the case of partial ordering, the event sequence is uncertain due to clock imprecision, and a probability of the real order is computed. In the future, this module will be refined with NTP clock monitoring functionality.
Misuse and anomaly detection: Misuse detection consists of looking for 'bad' action sequences. This was one of the first purposes of Orchids (see the ptrace example). Conversely, anomaly detection consists of describing a 'normal' behaviour and searching for any sequence that does not fit it. The Orchids language allows such rules to be written. A demonstration rule specifying Unix POSIX permission inheritance (via the setuid(), seteuid() and setreuid() system calls) has successfully detected several Linux kernel attacks and backdoors.
Generic intrusion detection platform: Orchids can efficiently correlate events of any type and from any source. These can be events from a host (operating system and kernel events), from the network (firewall, router, dedicated sensor, messages from other IDSs, etc.) or from applications and services (mail, database, web server, etc.).
Successfully detected attacks: Linux kernel attacks: ptrace() attack (BugTraq ID 7112), do_brk() attack (BID 9138), uselib() attack (BID 12190), mmap()/mremap()/munmap() attacks (BID 9686 and BID 9356), and some kernel backdoors that violate user identifiers and permissions. Apache/mod_SSL attack (BID 5363), OpenSSH channel attack (BID 4241).
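
The Kullback-Leibler comparison mentioned for the temporal analysis module above can be illustrated with a short Python sketch. It is not the Orchids implementation: the function name `kl_divergence`, the hour-of-day bins, the sample counts and the additive smoothing are all assumptions chosen to make the example self-contained.

```python
import math
from typing import Dict


def kl_divergence(window: Dict[str, int], reference: Dict[str, int],
                  smoothing: float = 1.0) -> float:
    """Return D_KL(window || reference) in bits over the union of bins.

    Counts are smoothed (assumption) so that a bin seen in only one of
    the two tables does not produce a division by zero.
    """
    bins = set(window) | set(reference)
    w_total = sum(window.values()) + smoothing * len(bins)
    r_total = sum(reference.values()) + smoothing * len(bins)
    score = 0.0
    for b in bins:
        p = (window.get(b, 0) + smoothing) / w_total
        q = (reference.get(b, 0) + smoothing) / r_total
        score += p * math.log2(p / q)
    return score


# Hypothetical learnt frequencies of an event category per hour of day.
learnt = {"00h": 2, "09h": 50, "14h": 48}
usual_window = {"09h": 5, "14h": 5}   # activity at the usual hours
odd_window = {"00h": 10}              # a burst in the middle of the night

usual_score = kl_divergence(usual_window, learnt)
odd_score = kl_divergence(odd_window, learnt)
```

A window that follows the learnt distribution yields a score close to zero, while unusual activity yields a larger one; a rule could compare such a score with a threshold.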


Here is a real sample of an attack recognition rule that introduces Orchids' capabilities. This example illustrates the detection of attacks on the Linux operating system kernel. This category of attacks exploits a missing permission check in the kernel during a ptrace() system call. The attack is referenced as BugTraq BID 7112 and as CVE candidate CAN-2003-0127. An attack program is usually a small piece of code like the one shown in figure 4. To properly detect all variations of such programs, the recognition relies on a model.

 *  Author: snooq
 *  Date: 10 April 2003

#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/wait.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/socket.h>
#include <linux/user.h>

#define SIZE (sizeof(shellcode)-1)  

pid_t parent=0;
pid_t child=0;
pid_t k_child=0;
static int sigc=0;

char shellcode[]=  

void sigchld() {
void sigalrm() {

main(int argc, char *argv[]) {
 int i, error;
 pid_t pid;
 struct user_regs_struct regs;

 switch (pid=fork()) {
 case -1:
  perror("Can't fork(): ");

 case 0:
        && (errno==ESRCH)) {
   fprintf(stderr, ".");
  if (error==-1) {
  if (ptrace(PTRACE_SYSCALL,k_child,0,0)==-1) {
  if (ptrace(PTRACE_GETREGS,k_child,NULL,&regs)==-1) {
   perror("-> Unable to read registers: ");
  for (i=0; i<=SIZE; i+=4) {
             *(int*)(shellcode+i))) { }
  if (ptrace(PTRACE_DETACH,k_child,0,0)==-1) {
   perror("-> Unable to detach from modprobe: ");
  if (kill(parent,9)==-1) {
   perror("-> We survived??!!??  ");


Figure 4: A small piece of code exploiting the Linux ptrace() flaw. (Original file)

The model used for the recognition is an automaton, which allows precise tracking of states, events, and their multiple meanings. The automaton used to recognize the Linux ptrace() attack is shown in figure 5. Here is a short description of how the correlation engine works. The algorithm is very similar to those used for regular expression evaluation or for parsing. The main difference is that Orchids uses non-deterministic automata instead of deterministic ones.

Recognition begins in the initial state q0. At this point, the system waits for a ptrace() call requesting an attachment to a process, for any host, user and process identifier. When such an event occurs, it is caught and the system enters state q1, where it waits for an execution of the modprobe program, or for another action that indicates a normal debugging operation. Normally, a ptrace(ATTACH) request should only be made on a running process; this is the race condition (the attacker requests an attachment to a process that will be created in the near future). When the modprobe execution occurs, the system enters state q2 and waits for a ptrace(SYSCALL) request, which asks the attached process to stop just before or after its next system call. The construction composed of states q3 and q4 waits for an optional ptrace(GETREGS) event followed by a ptrace(POKETEXT). Optional events are not significant for detection, since the rule can match even if they do not occur, but they improve report accuracy. State q5 has a transition looping on itself; this construction catches a sequence of ptrace(POKETEXT) events of any length (this sequence corresponds to the attacker injecting the shellcode into the modprobe process). So the whole construction of states q3, q4 and q5 accepts a sequence beginning with an optional ptrace(GETREGS), followed by at least one ptrace(POKETEXT).

When such a sequence is recognized, the correlation engine waits for a ptrace(DETACH) event (which the attacker uses to resume execution on the injected shellcode). When a ptrace(DETACH) is caught by the analyzer, we can conclude that a ptrace() attack has been made. At this point (state q6), a preliminary alert report can be issued, and the attack response actions can be executed. In this example, the automaton keeps track of all actions that the attacker (here, the shellcode) may have performed between detection and reaction. This is ensured by the construction of states q6 and q7, in particular by the transition waiting for any system call from a given host and process identifier.

Of course, in order to be sure of catching all instances and all interleavings of attacks, the correlation engine needs to backtrack on each reached state.

ptrace rule automaton
Figure 5: Automaton detecting Linux ptrace() attacks.
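
The recognition described above can be sketched in Python. This is an illustration only, not the Orchids rule language or engine: the event field names (`pid`, `call`, `req`, `target`, `prog`), the helper names and the exact transition table are assumptions reconstructed from the prose, and the tracking done by states q6 and q7 after the alert is omitted. Instead of backtracking, the sketch keeps every reached state alive as a separate thread, which explores the same runs.

```python
from typing import Dict, List, Optional, Tuple

Event = Dict[str, object]
Env = Dict[str, object]            # correlation variables bound so far
Thread = Tuple[str, Env, List[Event]]


def ptrace(req: str):
    """Build a guard matching a ptrace(req) call consistent with env."""
    def guard(ev: Event, env: Env) -> Optional[Env]:
        if ev.get("call") != "ptrace" or ev.get("req") != req:
            return None
        new = {"attacker": ev["pid"], "target": ev["target"]}
        # Every already-bound variable must agree with this event.
        if any(env.get(k, v) != v for k, v in new.items()):
            return None
        return {**env, **new}
    return guard


def exec_modprobe(ev: Event, env: Env) -> Optional[Env]:
    """Match the execution of modprobe by the process being attached."""
    ok = (ev.get("call") == "execve" and ev.get("prog") == "modprobe"
          and ev.get("pid") == env.get("target"))
    return env if ok else None


# (source state, guard, destination state)
TRANSITIONS = [
    ("q0", ptrace("ATTACH"), "q1"),
    ("q1", exec_modprobe, "q2"),
    ("q2", ptrace("SYSCALL"), "q3"),
    ("q3", ptrace("GETREGS"), "q4"),    # optional, kept for the report
    ("q3", ptrace("POKETEXT"), "q5"),
    ("q4", ptrace("POKETEXT"), "q5"),
    ("q5", ptrace("POKETEXT"), "q5"),   # shellcode injection loop
    ("q5", ptrace("DETACH"), "q6"),     # q6: attack recognized
]


def run(events: List[Event]) -> List[Thread]:
    """Return every thread that reaches q6 on the given event flow."""
    threads: List[Thread] = []
    alerts: List[Thread] = []
    for ev in events:
        spawned: List[Thread] = []
        # A fresh instance may start on any event, hence the extra q0 thread.
        for state, env, trace in threads + [("q0", {}, [])]:
            for src, guard, dst in TRANSITIONS:
                new_env = guard(ev, env) if src == state else None
                if new_env is not None:
                    spawned.append((dst, new_env, trace + [ev]))
        # Old threads are kept: an event may also be skipped by a thread.
        threads += spawned
        alerts += [t for t in spawned if t[0] == "q6"]
    return alerts


def pt(req: str) -> Event:
    return {"pid": 1605, "call": "ptrace", "req": req, "target": 1606}


flow = [pt("ATTACH"),
        {"pid": 1606, "call": "execve", "prog": "modprobe"},
        {"pid": 999, "call": "open"},   # unrelated noise, ignored
        pt("SYSCALL"), pt("GETREGS"), pt("POKETEXT"), pt("DETACH")]
alerts = run(flow)
```

Because every thread survives each event, the number of threads grows quickly on long POKETEXT sequences; this is the kind of redundancy that the cuts and synchronization variables listed among the features are meant to limit.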

If such an automaton properly models the attack process with a certain level of generalization (correct use of correlation variables, no catching of irrelevant events), rules can catch many variations of the exploitation of a flaw. The rule presented in figure 5 can detect variations of attacks on this flaw, and includes in the report the actions performed by the shellcodes (see figure 6).

Attack 1: ptrace-kmoc.c (a chown()/chmod() shellcode)
Report summary:
attacker-pid=1605 ; target-pid=1606
[1605]SYS_ptrace(req=PTRACE_ATTACH,   pid=1606)
[1605]SYS_ptrace(req=PTRACE_SYSCALL,  pid=1606)
[1605]SYS_ptrace(req=PTRACE_GETREGS,  pid=1606)
[1605]SYS_ptrace(req=PTRACE_POKETEXT, pid=1606)
[1605]SYS_ptrace(req=PTRACE_DETACH,   pid=1606)

Attack 2: myptrace.c (binds a shell on port tcp/24876)
Report summary:
attacker-pid=1535 ; target-pid=1536
[1535]SYS_ptrace(req=PTRACE_ATTACH,   pid=1536)
[1535]SYS_ptrace(req=PTRACE_SYSCALL,  pid=1536)
[1535]SYS_ptrace(req=PTRACE_GETREGS,  pid=1536)
[1535]SYS_ptrace(req=PTRACE_POKETEXT, pid=1536)
[1535]SYS_ptrace(req=PTRACE_DETACH,   pid=1536)
[1536]SYS_kill(pid=1534, signal=SIGKILL)

Figure 6: Reports generated with the rule ptrace() for two different attacks.

To draw the analogy between our multi-event recognition with automata and the mono-event detection scheme, figure 7 shows mono-event rules represented as automata. Here, mono-event rules are automata composed of two states (exactly one initial and one final state) and one transition representing the mono-event rule, which is totally independent of time (and does not depend on any other rule, in the past or the future). Additionally, alerts are triggered when a final state is reached.

Common rules
Figure 7: Automata representing the detection scheme by mono-events filtering.
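
The two-state view of a mono-event rule can be sketched in Python as follows. The helper `mono_event_rule`, the event fields and the sample rule are hypothetical and only illustrate the idea that such a rule has no memory between events.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]


def mono_event_rule(predicate: Callable[[Event], bool]):
    """Two-state automaton: initial --predicate--> final (alert)."""
    def check(event: Event) -> bool:
        state = "initial"          # no memory survives between events
        if predicate(event):
            state = "final"
        return state == "final"
    return check


# Hypothetical rule: alert on any failed login as root.
failed_root_login = mono_event_rule(
    lambda ev: ev.get("user") == "root" and ev.get("result") == "failure"
)

flow: List[Event] = [
    {"user": "jsmith", "result": "success"},
    {"user": "root", "result": "failure"},
]
alerts = [ev for ev in flow if failed_root_login(ev)]
```

Each event is judged on its own, whereas the multi-event automaton of figure 5 carries states and variable bindings from one event to the next.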

The people currently working on this project are members of the SECSI team.

Active members are:

Jean Goubault-Larrecq (In charge, Full professor, ENS Cachan)
Stéphane Demri (Full-time Researcher, CNRS)
Julien Olivain (Project engineer, INRIA)
Hedi Benzina (Project engineer, CNRS)
Ahmad Fliti (Project engineer, CNRS)

SECSI is a joint project of the INRIA Futurs research unit and of the LSV (CNRS UMR 8643) at the ENS Cachan. See also the project page at INRIA.


J. Olivain and J. Goubault-Larrecq. The Orchids Intrusion Detection Tool. In Proceedings of the 17th International Conference on Computer Aided Verification (CAV'05), Edinburgh, Scotland, UK, July 2005, LNCS 3576, pages 286-290. Springer.
J. Goubault-Larrecq. Un algorithme pour l'analyse de logs. RR LSV-02-18, Lab. Specification and Verification, ENS de Cachan, Cachan, France, November 2002. 33 pages.
Muriel Roger and Jean Goubault-Larrecq. Log auditing through model checking. pages 220-236, 2001.

Related intrusion detection research papers:
Mark Crosbie, Bryn Dole, Todd Ellis, Ivan Krsul, and Eugene Spafford. IDIOT - user guide. Technical Report TR-96-050, Purdue University, West Lafayette, IN, US, September 1996.
Vern Paxson. Bro: a system for detecting network intruders in real-time. Computer Networks (Amsterdam, Netherlands: 1999), 31(23-24):2435-2463, 1999.
Abdelaziz Mounji, Baudouin Le Charlier, Denis Zampunieris, and Naji Habra. Preliminary report on distributed asax.
S.T. Eckmann, G. Vigna, and R.A. Kemmerer. STATL: An Attack Language for State-based Intrusion Detection. Journal of Computer Security, 10(1/2):71-104, 2002.
Giovanni Vigna and Richard A. Kemmerer. Netstat: A network-based intrusion detection approach. In ACSAC, pages 25-, 1998.
M2D2 : A Formal Data Model for IDS Alert Correlation (with B. Morin, L. Mé and H. Debar) RAID 2002 (Recent Advances in Intrusion Detection). Springer-Verlag, Lecture Notes in Computer Science.
Christophe Dousson and Thang Vu Duong. Discovering chronicles with numerical time constraints from alarm logs for monitoring dynamic systems. In Proceedings of the 16th IJCAI, pages 620-626, Stockholm, Sweden, August 1999.
Ludovic Mé. GASSATA, A Genetic Algorithm as an Alternative Tool for Security Audit Trails Analysis. First international workshop on the Recent Advances in Intrusion Detection (RAID98). September 14-16, 1998. Louvain-la-Neuve, Belgium.
Cédric Michel and Ludovic Mé. ADeLe: an Attack Description Language for Knowledge-based Intrusion Detection. In Proceedings of the 16th International Conference on Information Security. Kluwer. June 2001.
J.-P. Pouzol and M. Ducassé. Formal specification of intrusion signatures and detection rules. 15th IEEE Computer Security Foundations Workshop (CSFW'02). June 2002.

Additional related research papers:
J. F. Allen. Maintaining knowledge about temporal intervals. In D. S. Weld and J. de Kleer, editors, Readings in Qualitative Reasoning about Physical Systems, pages 361-372. Kaufmann, San Mateo, CA, 1990.
J. F. Allen. Towards a general theory of action and time. In J. Allen, J. Hendler, and A. Tate, editors, Readings in Planning, pages 464-479. Kaufmann, San Mateo, CA, 1990.
James F. Allen. Time and time again: the many ways to represent time. International Journal of Intelligent Systems, 6:341-355, 1991.
James F. Allen and George Ferguson. Actions and events in interval temporal logic. Technical Report TR521, 1994.
G. Vigna, S.T. Eckmann, and R.A. Kemmerer. Attack Languages. In Proceedings of the IEEE Information Survivability Workshop, Boston, MA, October 2000.

Related technical documents:
D. A. Curry and H. Debar. Internet draft - Intrusion Detection Message Exchange Format (IDMEF) data model and Extensible Markup Language (XML) document type definition, March 2007.
R. Bace and P. Mell. Intrusion detection systems. Technical Report Special Publication 800-31, National Institute of Standards and Technology (NIST), 2001.
Sun Microsystems. Solaris System Administration Guide: Security Services.
Microsoft. Windows Event Logging System.
RFC 1157 - Simple Network Management Protocol (SNMP)
RFC 1905 - Protocol Operations for Version 2 of the Simple Network Management Protocol (SNMPv2)
The Common Intrusion Detection Framework (CIDF), 1999.

Public attack lists and databases:
ICAT Metabase
CVE: Common Vulnerabilities and Exposures
CERT/CC Vulnerability Notes Database
SecurityFocus BugTraq
CIAC: Computer Incident Advisory Capability
