Opened 11 years ago

Last modified 8 years ago

#525 accepted enhancement

Service manager

Reported by: Martin Decky Owned by: Michal Koutny
Priority: major Milestone:
Component: helenos/app/init Version: mainline
Keywords: gsoc13, gsoc14 Cc:
Blocker for: Depends on:
See also: #414, #443, #184

Description (last modified by Jiri Svoboda)

Design and implement a replacement for the current init task, i.e. a full-fledged service manager.

Details
The current init task in HelenOS is very roughly equivalent to the Unix init process. Its purpose is to coordinate the bootstrap sequence in user space, start all necessary services and components of the user session. However, the current init is even more rudimentary than the traditional SysV init: it is a more-or-less sequential program that performs some hard-coded actions with only limited and mostly compile-time configuration options.

The goal of this ticket is to design and implement a general bootstrap and service management infrastructure for HelenOS that would allow a flexible and declarative way of configuring the bootstrap process and services, as well as their run-time monitoring, starting, stopping and (possibly automatic) restarting. While using different implementation means, the infrastructure should provide features similar to those of systemd for Linux, including (but not limited to) dependency management and concurrent execution of independent tasks.

The goal is not to port systemd to HelenOS, but to reimplement its non-trivial features from scratch.

What Gains and Benefits will this bring?
Even in monolithic systems the importance of a reliable service management infrastructure is becoming more and more obvious. In a microkernel system, where even the basic system functionality depends on many components and services, reliable and configurable service management is even more important. This ticket will enable HelenOS to move past the rudimentary hard-coded service configuration to a more flexible open infrastructure.
Difficulty
Medium
Required skills
A successful applicant will have good C programming skills and the ability to survive in a non-standard non-POSIX application environment.
Documentation
Possible mentors
HelenOS Core Team, Martin Decky

Change History (10)

comment:1 by Jiri Svoboda, 11 years ago

Description: modified (diff)
Summary: System daemon → Service manager

comment:2 by Jiri Svoboda, 11 years ago

A series of articles on systemd in Czech: http://www.abclinuxu.cz/serialy/systemd

comment:3 by Martin Decky, 10 years ago

Keywords: gsoc14 added

comment:4 by Jakub Jermář, 10 years ago

Milestone: 0.5.1 → 0.5.2

comment:5 by Michal Koutny, 9 years ago

Owner: set to Michal Koutny
Status: new → accepted

Note: I have converged on some ideas that are similar to, or inspired by, systemd. Therefore I try to stick to its terminology.

Dependency resolution

A brief summary of approaches to dependency resolution follows. The abstraction is a set of entities, each of which is either satisfied or unsatisfied, together with actions that change this state for each entity. Let's call these entities units from the start (explained in detail below).

None (static order)

  • just ordered list of units (~inittab)
  • trivial implementation
  • easy administration (insert to proper position)
  • needs separate lists per target (runlevel)
  • notion of dependency closure is a bit skewed — anything following chosen unit
    • limits dynamic operations with units
  • cannot make up for the simplicity with shell scripting (not available in HelenOS so far)

Poset (partially ordered set)

  • each unit specifies its prerequisites (~Makefile)
  • sustainable administration: explicit dependencies, hierarchy-scalability (with virtual units)
  • fine grained closures
  • must specify dependencies manually
  • poset is static (cannot react to events; think of make)
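
The poset approach can be illustrated with a minimal sketch in plain C. It is not HelenOS code: unit_t, poset_order and MAX_UNITS are hypothetical names, and the choice of Kahn's algorithm is my assumption. Each unit lists its prerequisites by index; the function produces a valid start order or reports a cycle.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UNITS 16

/* Hypothetical unit: a name plus the indices of its prerequisites. */
typedef struct {
	const char *name;
	size_t prereq[MAX_UNITS];
	size_t prereq_cnt;
} unit_t;

/*
 * Kahn's algorithm: writes a valid start order (unit indices) into order[]
 * and returns 0, or returns -1 when the prerequisites contain a cycle.
 */
int poset_order(const unit_t *units, size_t cnt, size_t *order)
{
	size_t pending[MAX_UNITS];  /* unsatisfied prerequisites per unit */
	size_t done = 0;

	assert(cnt <= MAX_UNITS);

	for (size_t i = 0; i < cnt; i++)
		pending[i] = units[i].prereq_cnt;

	/* Units without prerequisites can start right away. */
	for (size_t i = 0; i < cnt; i++) {
		if (pending[i] == 0)
			order[done++] = i;
	}

	/* order[] doubles as the work queue of already started units. */
	for (size_t head = 0; head < done; head++) {
		size_t started = order[head];
		for (size_t i = 0; i < cnt; i++) {
			for (size_t j = 0; j < units[i].prereq_cnt; j++) {
				if (units[i].prereq[j] == started && --pending[i] == 0)
					order[done++] = i;
			}
		}
	}

	/* Units on a cycle never reach zero pending prerequisites. */
	return (done == cnt) ? 0 : -1;
}
```

Units that end up next to each other in order[] without depending on one another could be started concurrently; the sketch only demonstrates the ordering constraint and the cycle check.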

Event engine

  • satisfying a unit emits an event, can react to that
    • needs event aggregation for proper resolution (on: eventA and eventB)
  • holy grail of dynamics (if it emits event, you can react to it, even more complexly than satisfying a unit)
  • administration burden (naming and documenting events, IMHO)
  • error-prone implementation (event broadcasting, synchronization; IMHO)
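
The event aggregation mentioned above (on: eventA and eventB) could look roughly like the following sketch. Everything here is hypothetical (the EV_* identifiers, trigger_t and both functions); it only shows a conjunction of events tracked in a bitmask, firing exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event identifiers; each one occupies one bit. */
enum {
	EV_NS_READY = 1u << 0,
	EV_VFS_READY = 1u << 1,
	EV_ROOT_MOUNTED = 1u << 2,
};

/* Hypothetical trigger: fires once all events in `wanted` have been seen. */
typedef struct {
	uint32_t wanted;
	uint32_t seen;
	bool fired;
} trigger_t;

void trigger_init(trigger_t *t, uint32_t wanted)
{
	assert(wanted != 0);
	t->wanted = wanted;
	t->seen = 0;
	t->fired = false;
}

/*
 * Records an event. Returns true exactly once, at the moment the whole
 * conjunction becomes complete; unrelated and repeated events are ignored.
 */
bool trigger_notify(trigger_t *t, uint32_t event)
{
	t->seen |= event & t->wanted;
	if (!t->fired && t->seen == t->wanted) {
		t->fired = true;
		return true;
	}
	return false;
}
```

A trigger initialized with EV_NS_READY | EV_VFS_READY stays silent after the first of the two events and fires on the second, regardless of arrival order.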

Dataflow variables

  • every dependency stems from a shared object
    • synchronization can be delegated to OS
      • ((mainly) sockets in UNIX)
      • IPC in HelenOS
  • implicit dependencies ("most natural")
  • the following two problems are mutually exclusive:
    • no target distinction (start all, dependencies will be resolved)
    • not sure about satisfiability (could hang forever, would need explicit dependencies to verify)
  • killing (stopping) problem
    • cannot just "kill all"
    • would need "graceful stop all" (a unit terminates correctly only after it knows it is no longer used by its dependants)
      • UNIX can overcome this with filesystem sockets that outlive cooperating subjects
  • could exploit HelenOS IPC and async framework
  • potentially no need of maintaining explicit dependencies
  • not sure whether a HelenOS workflow with a cycle (deadlock) can exist; currently "it boots"
  • no control of determinism
  • cannot synchronize on mountpoints

Summary

With some assumptions all these representations are equivalent. Thinking about transformations points out technical specialities. (Event→poset is most challenging, I hope I did not make any fundamental mistakes.)

I think that combining the poset and dataflow variables approaches, in order to mitigate their respective disadvantages, could be suitable for HelenOS (which is IMO what systemd actually does).

Types of units

Here is a list of unit types that are necessary to achieve behavior equivalent to the current state (with respect to HelenOS startup).

Kernel service

(The name is just a suggestion.) The tasks started directly by the kernel.

  • infrastructure: ns, vfs, loader, (locsrv)
  • initial filesystem: (locsrv), rd, RDFMT fs
  • early coordinator: einit (I'm still thinking about level of separation/sharing with actual init)

Service

Any server task started from uspace.

Mounted filesystem

Mounted (i.e. initialized and present in VFS tree, #4471) filesystem ready for use.

Units of the above-mentioned types can be composed to achieve behavior equal to the current one. Below are other types of units that would allow more precise system monitoring and control.

Target

Just defines dependencies on other units. Allows multiple environment configurations and switching between them.

Device

Should represent an entity in locfs. Not sure whether it's really necessary, as we can have blocking waits on loc_service_get_id.

Driver

Motivation: the system daemon should be aware of all long-running non-user tasks. For legacy drivers it would be mere service labeling, though DDF drivers would involve some coordination with the device manager.

  • simply change task_spawn to system_start_driver when starting a driver in devman; this would just be the system daemon's API for starting a driver-labelled service
  • greater separation: devman won't keep the state of the driver, it'll just call that system_start_driver function and the system daemon would ensure the singleton nature of the driver
  • though, I was not able to find any "task" feedback from the driver to devman so far
    • if a driver fails, is it devman's or the system daemon's business?
  • stopping drivers
    • surely they can't be just killed
    • should devman know about it? (further, it could even initiate stopping of unused drivers)

Virtual service

Motivation: alternatives (assumes dependants refer to a well known general shared object, e.g. port 80 or category device, otherwise no alternatives would be needed; consider just different targets).

Job

Transient single-run task. Possibly with a schedule.

Custom

Arbitrary extensions in the future, once script interpretation is possible.

Task state monitoring

In order to manage the system, the system daemon would need to know the state of relevant tasks.

Differential view

The system daemon would need to know about termination (currently possible with task_wait) and spawning of its related tasks. Spawning information could be obtained from the naming service via a (new) function like task_monitor_spawn(task_id_t pid, ...) or by tracing the SYS_PROGRAM_SPAWN_LOADER syscall (though some work would be necessary to obtain the ID of the spawned task).

In this approach the system daemon would keep the process tree.

Integral view

In this case the system daemon would use functions like:

  • waiting for any task in a subtree,
  • kill all tasks from a subtree.

Someone else would keep the tree information. The naming service is a suitable candidate (though it would violate the single responsibility principle).
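
To make the two subtree operations more concrete, here is a small sketch of the bookkeeping that the keeper of the tree might do. It is plain C with hypothetical names (task_node_t, subtree_kill, subtree_alive); killing is only simulated by clearing a flag, where a real implementation would call the system's kill primitive. It assumes the parent links form a tree (no cycles).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_PARENT ((size_t) -1)

/* Hypothetical bookkeeping record for one task in the spawn tree. */
typedef struct {
	unsigned long id;  /* stands in for a task ID */
	size_t parent;     /* index of the spawning task, or NO_PARENT */
	bool alive;
} task_node_t;

/* True when node idx is root itself or one of its descendants. */
static bool in_subtree(const task_node_t *tasks, size_t idx, size_t root)
{
	while (idx != NO_PARENT) {
		if (idx == root)
			return true;
		idx = tasks[idx].parent;
	}
	return false;
}

/* Marks every live task in the subtree as dead; returns how many were hit. */
size_t subtree_kill(task_node_t *tasks, size_t cnt, size_t root)
{
	size_t killed = 0;

	assert(root < cnt);
	for (size_t i = 0; i < cnt; i++) {
		if (tasks[i].alive && in_subtree(tasks, i, root)) {
			tasks[i].alive = false;
			killed++;
		}
	}
	return killed;
}

/* "Wait for any task in a subtree" can poll this until it turns false. */
bool subtree_alive(const task_node_t *tasks, size_t cnt, size_t root)
{
	assert(root < cnt);
	for (size_t i = 0; i < cnt; i++) {
		if (tasks[i].alive && in_subtree(tasks, i, root))
			return true;
	}
	return false;
}
```

Records of terminated tasks are kept on purpose, so that grandchildren can still be attributed to the right subtree after an intermediate task exits.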

"Sideview"

This would require a kernel modification in order to add an abstraction of task groups (similar terms are: process groups, Solaris contracts, cgroups). The reason for putting such a responsibility into the microkernel would be its possible use for groupwise resource control (CPU, memory). Using this for controlling task groups might be considered unintended or even inappropriate, not to mention the difficulty of such a change.

Non-task units

Mounted filesystem

The system daemon needn't be aware of any mounted filesystem (IMHO); however, a user can potentially unmount a unit-comprising filesystem or mount one (e.g. one that is not in the current target). AFAIK awareness of this could only be achieved by modifying the VFS server.

Device

We can monitor a category for added devices, though I didn't find functions for the symmetric event. However, I'm not really sure this type of unit is actually needed (as mentioned above).

Semantics of stop

In some use cases killing the whole subtree may be preferable, in others killing only the master task would be desired. Thus it should be configurable and would use the above-mentioned task state monitoring.

Similarly, sometimes immediate termination of an unresponsive service is better, while at other times a request for graceful termination is preferable. The system daemon should provide an API to register a stopping handler; the service in question would then be asked via IPC to terminate (and later possibly killed). Thanks to the knowledge of dependencies, this could be used to stop a whole set of services (I'm optimistic about #414).
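
The "ask first, kill later" behavior can be written down as a tiny state machine. This is only a sketch under my own assumptions: the names (stoppable_t, stop_begin, stop_tick, stop_exited) are hypothetical, time is counted in abstract ticks, and the functions merely return which action the daemon should take instead of performing any IPC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { UNIT_RUNNING, UNIT_STOPPING, UNIT_STOPPED } stop_state_t;
typedef enum { ACT_NONE, ACT_REQUEST_STOP, ACT_KILL } stop_action_t;

/* Hypothetical per-service stop configuration and progress. */
typedef struct {
	stop_state_t state;
	bool has_handler;      /* service registered a stopping handler */
	unsigned grace_ticks;  /* how long to wait before killing */
	unsigned waited;
} stoppable_t;

/* Called when a stop is requested; tells the caller what to do now. */
stop_action_t stop_begin(stoppable_t *s)
{
	assert(s != NULL);
	if (s->state != UNIT_RUNNING)
		return ACT_NONE;
	if (!s->has_handler || s->grace_ticks == 0) {
		/* Nobody to ask, or no patience configured: kill right away. */
		s->state = UNIT_STOPPED;
		return ACT_KILL;
	}
	s->state = UNIT_STOPPING;
	s->waited = 0;
	return ACT_REQUEST_STOP;
}

/* Called periodically; escalates to a kill once the grace period is over. */
stop_action_t stop_tick(stoppable_t *s)
{
	assert(s != NULL);
	if (s->state != UNIT_STOPPING)
		return ACT_NONE;
	if (++s->waited >= s->grace_ticks) {
		s->state = UNIT_STOPPED;
		return ACT_KILL;
	}
	return ACT_NONE;
}

/* Called when task monitoring reports that the service has exited. */
void stop_exited(stoppable_t *s)
{
	assert(s != NULL);
	s->state = UNIT_STOPPED;
}
```

Whether ACT_KILL applies to the master task only or to the whole subtree would be the configurable part discussed above.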

Later

Dynamicity

All in all, the system daemon would react to asynchronous events (see Task state monitoring). This would essentially mean coupling these events with some control commands (e.g. start service).

Configuration

Modification, serialization, where it is persisted.

comment:6 by Michal Koutny, 9 years ago

Extending the location service to be the service manager helps to solve some of the above-mentioned topics.

Virtual services

  • They could be existing categories.
  • Though locsrv should know how to start at least one task that would fill the category (so that the having-alternative motivation is fulfilled).

Dynamicity

  • Can be achieved by monitoring a category.

Task state monitoring

With locsrv having a session with each server, it can monitor tasks by phone hangup. However, we would lack the state of service tasks that do not communicate with locsrv (spawned by the original servers).

comment:7 by Michal Koutny, 9 years ago

After exploring internals of some HelenOS services, I've refined my ideas and matched them better for implementation in the HelenOS environment.

Entities

Unit

  • they are what is configured and controlled
  • they bear the logic state: started, stopped

Task

  • nothing special, task in standard HelenOS sense

Exposee

  • unit's interaction point
  • unit can have multiple exposees

Component

  • M:N mapping between tasks and units
    • unit has a component for each of its tasks (e.g. spawned tasks)
    • task has component for each of its units (e.g. vfs and mounts)
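
One way to picture the M:N mapping is a flat table of (task, unit) pairs, one row per component. The sketch below is hypothetical throughout (component_t and both helper functions are made-up names); it shows the two lookups that follow from the definition above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical component: links one task to one unit. */
typedef struct {
	size_t task;  /* index of the task */
	size_t unit;  /* index of the unit */
} component_t;

/* Number of components (i.e. tasks) that belong to a unit. */
size_t components_of_unit(const component_t *c, size_t cnt, size_t unit)
{
	size_t n = 0;

	assert(c != NULL || cnt == 0);
	for (size_t i = 0; i < cnt; i++) {
		if (c[i].unit == unit)
			n++;
	}
	return n;
}

/*
 * Collects the units affected by a state change of one task (at most cap
 * of them) and returns how many were stored in units[].
 */
size_t units_of_task(const component_t *c, size_t cnt, size_t task,
    size_t *units, size_t cap)
{
	size_t n = 0;

	assert(c != NULL || cnt == 0);
	for (size_t i = 0; i < cnt && n < cap; i++) {
		if (c[i].task == task)
			units[n++] = c[i].unit;
	}
	return n;
}
```

For example, a single vfs task would appear in one row for the vfs service unit and in one more row per mount unit, while a unit with spawned helper tasks would have one row per task.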

Responsibilities

Broker

  • resolves exposee to a handle
  • mediates connections (to a component) via handle
    • autostart option
  • API for (un)registration of exposees

Restarter

  • knows how to start a unit
  • knows how to stop a unit
  • monitors (and keeps) unit's state
    • in ideal world unit's state = "sum" of start/stop operations
    • real world
      • components (tasks) — must monitor task state and propagate it to unit's state
      • exposees — unit may be operational only with exposees — restarter must communicate with broker

(Dependency) resolver

  • autostart is not a cure-all, explicit dependencies are still needed
    • fake exposees (mount points)
    • dependency only units (targets)
  • basic structure is dependency graph (should be DAG)
    • units are vertices
    • edge A → B
      • B must be started before A (is started)
      • B was requested by A (implicit edge, e.g. autostart)
      • B can only be stopped after A (is stopped) (for correct termination of A)
  • as dependencies are global (any on any), the resolver has to be global as well
  • resolver resolves the dependencies and uses restarter(s) to change the state of units
    • on the other hand, restarters also notify the resolver about unit state changes
  • broker notifies resolver about mediated connections in order to create implicit edges
    • implicit edges persist (broker doesn't know about disconnections), however they are "meaningless" when depending unit is stopped
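
The edge semantics above can be sketched as follows. This is an illustration, not a proposed implementation: resolver_t and its functions are hypothetical names, the graph is a small adjacency matrix, and "starting" a unit only means appending it to an output array where a real resolver would call a restarter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_UNITS 8

/* Hypothetical resolver state; edge[a][b] means "a depends on b". */
typedef struct {
	bool edge[MAX_UNITS][MAX_UNITS];
	bool started[MAX_UNITS];
	size_t cnt;
} resolver_t;

/* Used for explicit (configured) and implicit (broker-reported) edges alike. */
void resolver_add_edge(resolver_t *r, size_t a, size_t b)
{
	assert(a < r->cnt && b < r->cnt && a != b);
	r->edge[a][b] = true;
}

/* Starts u after everything it depends on; appends the order to out[]. */
size_t resolver_start(resolver_t *r, size_t u, size_t *out, size_t n)
{
	assert(u < r->cnt);
	if (r->started[u])
		return n;
	/* Mark first, so that a stray cycle cannot recurse forever. */
	r->started[u] = true;
	for (size_t b = 0; b < r->cnt; b++) {
		if (r->edge[u][b])
			n = resolver_start(r, b, out, n);
	}
	out[n++] = u;
	return n;
}

/* Stops u after every started unit that depends on it. */
size_t resolver_stop(resolver_t *r, size_t u, size_t *out, size_t n)
{
	assert(u < r->cnt);
	if (!r->started[u])
		return n;  /* edges from stopped units are "meaningless" */
	r->started[u] = false;
	for (size_t a = 0; a < r->cnt; a++) {
		if (r->edge[a][u])
			n = resolver_stop(r, a, out, n);
	}
	out[n++] = u;
	return n;
}
```

With units ns, vfs and app, where vfs depends on ns and app depends on vfs, starting app yields the order ns, vfs, app, and stopping ns yields app, vfs, ns.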

Current state

Brokers

  • ns (exposee to handle mapping is identity)
  • locsrv (exposee is namespace/name)
  • devman (exposee is a device tree path)
  • vfs (exposee is a mountpoint)

Restarters

  • devman knows how to start a driver
  • init knows (hardcoded) how to start anything

Resolvers

  • init knows (hardcoded) order of "units"

Proposed solution

  • keep brokers doing their work
    • and only modify them to provide information to restarter(s) and resolver
  • one task for resolver
    • as mentioned above, the dependency graph is a global structure, so it is better not to distribute it
  • one restarter task
    • needs information from configuration, so make it common
    • brokers needn't distinguish restarters for different units
  • put resolver and restarter in the same (modular) task
    • they share configuration
    • eliminates IPC overhead in their communication
  • unit types (minimal set)
    • service AYKI (as you know it, from locsrv or ns)
    • driver (with multiple exposees)
    • mount
    • job (e.g. startup GUI application)
    • target (~runlevel)
    • configuration (load (new) configuration, allows using same logic from early userspace to user session)
  • task monitoring
    • should be a new responsibility of ns
    • begin with single tasks and extend it to spawning tree operations (integral approach from previous post)
Last edited 9 years ago by Michal Koutny

comment:8 by Michal Koutny, 9 years ago

See also: #414 → #414, #443

comment:9 by Michal Koutny, 9 years ago

See also: #414, #443 → #414, #443, #184

comment:10 by Jakub Jermář, 8 years ago

Milestone: 0.6.1