susam a day ago

About 20 or so years ago, I came across a configuration pattern that could arguably be called "Level 0": configuration by file existence. The file itself would typically be empty, so no parsing, syntax, or schema is involved. For example, if the file /opt/foo/foo.txt exists, the software does one thing; if it is missing, the software does another. Effectively, the existence of the file serves as a boolean configuration flag.
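
A minimal sketch of the pattern in Python (the path is just the example from above):

    import os

    # "Level 0" configuration: only the file's existence matters, not its contents.
    FLAG = "/opt/foo/foo.txt"

    if os.path.exists(FLAG):
        print("feature enabled")
    else:
        print("feature disabled")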

  • bangonkeyboard a day ago

    A top-level /AppleInternal/ directory on macOS, even if empty, will enable certain features in Apple developer tools.

    • LeifCarrotson a day ago

      I've observed similar things in other tools too - third-party tools that list a certain set of file formats by default, and enable Autodesk formats if C:/Autodesk is present.

      It seems to be most common in systems that cross major business domains (or come from completely separate companies). If you could just add an entry to a featureful, versioned, controlled config file, you'd do that; but if you can't, developers frequently resort to the "does this path name resolve" heuristic.

  • somat a day ago

    A number of traditional Unix utilities change their behavior based on the name they are invoked as; /bin/test and /bin/[ come to mind. I just checked, and a quick survey of OpenBSD finds:

        eject mt
        [ test
        chgrp chmod
        cksum md5 sha1 sha256 sha512
        cpio pax tar
        ksh rksh sh
    
    Taken to its logical extreme you end up with something like crunchgen https://man.openbsd.org/crunchgen which merges many independent programs into one binary and selects which one to run based on the name.
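
    A rough sketch of that dispatch-on-name behavior in Python, in the spirit of crunchgen (the applet names here are just illustrative):

        import os
        import sys

        # Pick a behavior based on the name the program was invoked as (argv[0]),
        # e.g. via symlinks or hard links that all point at this one script.
        APPLETS = {
            "md5": lambda: print("acting as md5"),
            "sha256": lambda: print("acting as sha256"),
            "test": lambda: sys.exit(0),
        }

        name = os.path.basename(sys.argv[0])
        APPLETS.get(name, lambda: sys.exit(f"unknown applet: {name}"))()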

    And I am guilty of abusing symbolic links as a simple single-value key-value store. It turns out the link does not need to point to anything, and using readlink(1) was easier than parsing a file.
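
    A sketch of the symlink trick (the path is made up); the link target never has to exist:

        import os

        STATE = "/tmp/current-profile"   # hypothetical location for the "value"

        def set_value(value: str) -> None:
            try:
                os.remove(STATE)
            except FileNotFoundError:
                pass
            os.symlink(value, STATE)     # dangling link; the target string is the value

        def get_value() -> str:
            return os.readlink(STATE)    # no file to open or parse

        set_value("office")
        print(get_value())               # -> office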

    • cduzz a day ago

      ssh and rsh used to connect you to the host matching the name they were invoked as...

      so I had, in my home directory

      ~me/bin/snoopy

      and if I wanted to log into snoopy, I'd just type

      $ snoopy

      and it'd rsh me into snoopy.

      Hilarity ensued the day someone ran

        cd /export/home && for user in * ; do chown -R $user:users $user ; done
      
      (note the lack of the -h flag, which would have changed the ownership of the symbolic link itself rather than the file it references)

  • noahjk 12 hours ago

    Similar behavior in the s6 overlay framework for containers. Some files do things just by existing IIRC

  • ks2048 a day ago

    This gives me an idea - store small integer parameters (<= 511) as file permissions (r/w/exe for user/group/other) on an empty file.
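
    For fun, a sketch of that in Python (the path is made up); stat.S_IMODE reads the nine permission bits back:

        import os
        import stat

        PATH = "/tmp/magic-number"       # hypothetical empty file

        def store(n: int) -> None:
            assert 0 <= n <= 0o777       # nine bits: 0..511
            open(PATH, "a").close()      # ensure the empty file exists
            os.chmod(PATH, n)

        def load() -> int:
            return stat.S_IMODE(os.stat(PATH).st_mode)

        store(493)
        print(load())                    # -> 493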

    • rzzzt a day ago

      Larger integers can go into UID:GID values.

  • umbra07 a day ago

    I use this approach for testing conditional logic in shell scripts sometimes.

  • esafak a day ago

    That's just cramming a flag into the file system.

alexambarch a day ago

I’d argue Terraform/HCL is quite popular as a Level 4 configuration language. My biggest issue with it is that once things get sufficiently complex, you wish you were using a Level 5 language.

In fact, it’s hard to see where a Level 4 language perfectly fits. After you’ve surpassed the abilities of JSON or YAML (and you don’t opt for slapping on a templating engine like Helm does), it feels like jumping straight to Level 5 is worth the effort for the tooling and larger community.

  • default-kramer a day ago

    I'm very surprised we don't see more people using a level 5 language to generate Terraform (as level 3 JSON) for this exact reason. It would seem to be the best of both worlds -- use the powerful language to enforce consistency and correctness while still being able to read and diff the simple output to gain understanding. In this hypothetical workflow, Terraform constructs like variables and modules would not be used; they would be replaced by their counterparts in the level 5 language.

    https://developer.hashicorp.com/terraform/language/syntax/js...
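
    A minimal sketch of that workflow in Python (the resource and bucket names are made up), emitting Terraform's JSON syntax directly:

        import json

        buckets = ["logs", "assets", "backups"]

        # Loops and naming rules live in Python; the output is plain,
        # reviewable, diffable main.tf.json with no Terraform variables or modules.
        config = {
            "resource": {
                "aws_s3_bucket": {
                    name: {"bucket": f"mycorp-{name}"} for name in buckets
                }
            }
        }

        with open("main.tf.json", "w") as f:
            json.dump(config, f, indent=2, sort_keys=True)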

    • JanMa a day ago

      That actually works quite well. I once built a templating engine for Terraform files based on JQ that reads in higher-level YAML definitions of the resources that should be created and outputs valid Terraform JSON config. The main reason back then was that you couldn't dynamically create Terraform provider definitions in Terraform itself.

      Later on I migrated the solution to Terramate which made it a lot more maintainable because you write HCL to template Terraform config instead of JQ filters.

    • harshitaneja a day ago

      This is exactly how we do it, with our own rudimentary internal library and scripts. It's barely enough, and even though I worry at times that it will break unexpectedly, so far we have been surprised by how stable everything has been.

      I really wish there were a first-party solution or a well-established library for this, but I suspect that while it is easy to build just enough to support specific use cases, building a solution generic enough for everyone would be quite an undertaking.

    • rattyJ2 15 hours ago

      This is basically how pulumi and tfsdk work. You write go/ts/python/... that generates a bunch of config files into a temp folder, and the tool then reconciles those.

  • danpalmer a day ago

    The problem with HCL is that it's a Level 4 language masquerading as a Level 3 language, rather than a Level 4 language masquerading as a Level 5 (like Starlark, Dhall, even JSONNET). Because of that its syntax is very limited and it needs awkwardly nuanced semantics, and becomes difficult to use well as a result.

    HCL is best used when the problem you're solving is nearly one you could use a level 3 language for, whereas in my experience, Starlark is only really worth it when what you need is nearly Python.

  • miningape a day ago

    The choice between 4 and 5 is more about what you get to avoid. By choosing level 5 you open up the possibility of really complicated configurations and many more footguns. When you stay at level 4 you're forced into more "standardised" blocks of code that can easily be looked up online and understood.

    Level 4 is also far more declarative by nature: you cannot fully compute stuff, so a lot is abstracted away declaratively. This also leads to simpler code, since you're less encouraged to get into the weeds of instantiation and can instead just declare what you'd like.

    Overall it's about forcing simplicity by not allowing the scope of possibilities to explode. Certainly there are cases where you can't represent problems cleanly, but I think that tradeoff is worth it because of lowered complexity.

    Another benefit of level 4 is that it's easier for your configuration to stay the same while you change the underlying system being configured, since there's a driver layer between the level 4 configuration and the system which can (ideally) be swapped out.

sgeisenh a day ago

> Don't waste time on discussions within a level.

I disagree with this. YAML has too many footguns (boolean conversions being the first among them), not to mention that it is a superset of JSON. Plain old JSON or TOML is much simpler.
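
For instance, with PyYAML (which still implements the YAML 1.1 scalar rules):

    import yaml

    doc = """
    norway: NO
    debug: off
    version: 3.10
    """
    print(yaml.safe_load(doc))
    # {'norway': False, 'debug': False, 'version': 3.1}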

  • xelxebar a day ago

    > YAML has too many footguns (boolean conversions being the first among them)

    Copying my own comment from elsewhere: https://news.ycombinator.com/item?id=43670716.

    This has been fixed since 2009 with YAML 1.2. The problem is that everyone uses libyaml (_e.g._ PyYAML _etc._) which is stuck on 1.1 for reasons.

    The 1.2 spec just treats all scalar types as opaque strings, along with a configurable mechanism[0] for auto-converting non-quoted scalars if you so please.

    As such, I really don't quite grok why upstream libraries haven't moved to YAML 1.2. Would love to hear details from anyone with more info.

    [0]:https://yaml.org/spec/1.2.2/#chapter-10-recommended-schemas

  • sevensor 19 hours ago

    Lack of nulls in TOML is a headache. No two YAML libraries agree on what a given YAML text means. Although JSON is bad at numbers, that's more easily worked around.

18172828286177 a day ago

> Don't waste time on discussions within a level. For example, JSON and YAML both have their problems and pitfalls but both are probably good enough.

Disagree. YAML is considerably easier to work with than JSON, and it’s worth dying on that hill.

  • MOARDONGZPLZ a day ago

    I love that there is one comment saying JSON is better, and then yours saying YAML is better.

    • drewcoo a day ago

      To be fair, both say "don't waste time" on it.

  • zzo38computer a day ago

    I don't really like either format (I am not sure which is worse; both have significant problems). YAML has some problems (such as the Norway problem and many other problems with the syntax), and JSON has different problems; and some problems are shared between both of them. Unicode is one problem that both of them have. Numbers are a problem in some implementations of JSON, although the format itself does not require that. (Many other formats have some of these problems too, such as using Unicode, and using floating-point numbers and not integers, etc.)

    I think DER is better (for structured data), although it is a binary format, but it is in canonical form. I made up the TER format which is a text format which you can compile to DER, and some additional types which can be used (such as a key/value list type). While Unicode is supported, there are other (sometimes better) character sets which you can also use.

    (However, not all configuration files need structured data, and sometimes programs are also useful to include, and these and other considerations are also relevant for other formats, so not everything should use the same file formats anyways.)

  • dijksterhuis a day ago

    anchors/aliases/overrides are one of my favourite yaml features. i've done so much configuration de-duplication with them, it's unreal.
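
    A small PyYAML illustration for anyone who hasn't used them (the keys are made up); the merge key pulls in the anchored defaults:

        import yaml

        doc = """
        defaults: &defaults
          retries: 3
          timeout: 30
        service_a:
          <<: *defaults
          timeout: 60
        service_b:
          <<: *defaults
        """
        print(yaml.safe_load(doc))
        # {'defaults': {'retries': 3, 'timeout': 30},
        #  'service_a': {'retries': 3, 'timeout': 60},
        #  'service_b': {'retries': 3, 'timeout': 30}}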

ajb a day ago

I'm not convinced by reducing this to a single dimension. There are differences in both 'what can be expressed' and 'what validation can be done' which are somewhat independent of each other

  • qznc a day ago

    Hm, you got me thinking about reversible computing and how it could be applied to configuration.

    Debugging a configuration becomes tedious once computation is involved. You think some value should be "foo" but it is "bar". Why is it "bar"? If someone wrote it there, the fix is simply to change it. If "bar" is the result of some computation, you have to understand the algorithm and its inputs, which is significantly harder.

    Given a "reversible" programming language that might be easier. Such languages are weird though and I don't know much about them. For example: https://en.wikipedia.org/wiki/Janus_(time-reversible_computi...

    • ajb a day ago

      Interesting idea! Although, maybe you just want to be able to run the configuration language in a reversible debugger?

      This issue becomes even harder when you have some kind of solver involved, like a constraint solver or unification. As a user the solver is supposed to make your life easier but if it rejects something without a good enough error message you are stuck; having to examine the solver code to work out why is a much worse experience than not having a solver. (This is the same issue with clever type systems that need a solver)

waynecochran a day ago

https://jsonnet.org/ I had never heard of this before. This seems like the JSON I wish I really had. Of course, at some point you could just use JavaScript. I guess that fits with option 5.

  • rssoconnor a day ago

    Dave Cunningham created jsonnet from some conversations I had with him about how Nix's lazy language allows one to make recursive references between parts of one's configuration in a declarative way. No need to order the evaluation beforehand.

    Dave also designed a way of doing "object oriented" programming in Nix which eventually turned into what is now known as overlays.

    P.S. I'm pretty sure jsonnet is Turing complete. Once you get any level of programming, it's very hard not to be Turing complete.

    • pwm 17 hours ago

      I would love to read more about all this, especially how overlays came from "object-oriented" programming. To me, the interesting part is their self-referential nature, for which lazy eval is indeed a great fit!

      For anyone interested, this is how I'd illustrate Nix's overlays in Haskell (I know I know, I'm using one obscure lang to explain another...):

        data Attr a = Leaf a | Node (AttrSet a)
          deriving stock (Show, Functor)
      
        newtype AttrSet a = AttrSet (HashMap Text (Attr a))
          deriving stock (Show, Functor)
          deriving newtype (Semigroup, Monoid)
      
        type Overlay a = AttrSet a -> AttrSet a -> AttrSet a
      
        apply :: forall a. [Overlay a] -> AttrSet a -> AttrSet a
        apply overlays attrSet = fix go
          where
            go :: AttrSet a -> AttrSet a
            go final =
              let fs = map (\overlay -> overlay final) overlays
              in foldr (\f as -> f as <> as) attrSet fs
      
      Which uses fix to tie the knot, so that each overlay has access to the final result of applying all overlays. To illustrate, if we do:

        find :: AttrSet a -> Text -> Maybe (Attr a)
        find (AttrSet m) k = HMap.lookup k m
      
        set :: AttrSet a -> Text -> Attr a -> AttrSet a
        set (AttrSet m) k v = AttrSet $ HMap.insert k v m
      
        overlayed =
          apply
            [ \final prev -> set prev "a" $ maybe (Leaf 0) (fmap (* 2)) (find final "b"),
              \_final prev -> set prev "b" $ Leaf 2
            ]
            (AttrSet $ HMap.fromList [("a", Leaf 1), ("b", Leaf 1)])
      
      we get:

        λ overlayed
        AttrSet (fromList [("a",Leaf 4),("b",Leaf 2)])
      
      Note that "a" is 4, not 2. Even though the "a = 2 * b" overlay was applied before the "b = 2" overlay, it had access to the final value of "b." The order of overlays still matters (it's right-to-left in my example tnx for foldr). For example, if I were to add another "b = 3" overlay in the middle, then "a" would be 6, not 4 (and if I add it to the end instead then "a" would stay 4).
      • rssoconnor 16 hours ago

        I have the following file called oop.nix dated from that time.

            # Object Oriented Programming library for Nix
            # By Russell O'Connor in collaboration with David Cunningham.
            #
            # This library provides support for object oriented programming in Nix.
            # The library uses the following concepts.
            #
            # A *class* is an open recursive set.  An open recursive set is a function from
            # self to a set.  For example:
            #
            #     self : { x = 4; y = self.x + 1 }
            #
            # Technically an open recursive set is not recursive at all, however the function
            # is intended to be used to form a fixed point where self will be the resulting
            # set.
            #
            # An *object* is a value which is the fixed point of a class.  For example:
            #
            #    let class = self : { x = 4; y = self.x + 1; };
            #        object = class object; in
            #    object
            #
            # The value of this object is '{ x = 4; y = 5; }'.  The 'new' function in this
            # library takes a class and returns an object.
            #
            #     new (self : { x = 4; y = self. x + 1; });
            #
            # The 'new' function also adds an attribute called 'nixClass' that returns the
            # class that was originally used to define the object.
            #
            # The attributes of an object are sometimes called *methods*.
            #
            # Classes can be extended using the 'extend' function in this library.
            # the extend function takes a class and extension, and returns a new class.
            # An *extension* is a function from self and super to a set containing method
            # overrides.  The super argument provides access to methods prior to being
            # overloaded.  For example:
            #
            #    let class = self : { x = 4; y = self.x + 1; };
            #        subclass = extend class (self : super : { x = 5; y = super.y * self.x; });
            #    in new subclass
            #
            # denotes '{ x = 5; y = 30; nixClass = <LAMBDA>; }'.  30 equals (5 + 1) * 5.
            #
            # An extension can also omit the 'super' argument.
            #
            #    let class = self : { x = 4; y = self.x + 1; };
            #        subclass = extend class (self : { y = self.x + 5; });
            #    in new subclass
            #
            # denotes '{ x = 4; y = 9; nixClass = <LAMBDA>; }'.
            #
            # An extension can also omit both the 'self' and 'super' arguments.
            #
            #    let class = self : { x = 4; y = self.x + 1; };
            #        subclass = extend class { x = 3; };
            #    in new subclass
            #
            # denotes '{ x = 3; y = 4; nixClass = <LAMBDA>; }'.
            #
            # The 'newExtend' function is a composition of new and extend.  It takes a
            # class and an extension and returns an object which is an instance of the
            # class extended by the extension.
            
            rec {
              new = class :
                let instance = class instance // { nixClass = class; }; in instance;
            
              extend = class : extension : self :
                let super = class self; in super //
                 (if builtins.isFunction extension
                  then let extensionSelf = extension self; in
                       if builtins.isFunction extensionSelf
                       then extensionSelf super
                       else extensionSelf
                  else extension
                 );
            
              newExtend = class : extension : new (extend class extension);
            }
        
        In nix overlays, the names "final" and "prev" used to be called "self" and "super", owing to this OOP heritage, but people seemed to find those names confusing. Maybe you can still find old instances of the names "self" and "super" in places.

        • pwm 15 hours ago

          Thanks for this! It's always nice to learn about the origins of things. I was around when they were called "self"/"super", but I never made the connection to OOP.

  • liveify 19 hours ago

    I made a decision early on in a project to replace YAML with jsonnet for configuration and it was the best decision I made on that project - I’ve written tens of thousands of lines of jsonnet since.

behnamoh a day ago

Lisp code is represented in the same data structures it manipulates. This homoiconicity makes Lisp a great config format, especially in a Lisp program. In comparison, you can't represent JS code in JSON.

chubot a day ago

Hm I also made a taxonomy of 5 categories of config languages, which is a bit different

Survey of Config Languages https://github.com/oils-for-unix/oils/wiki/Survey-of-Config-...

    Languages for String Data
    Languages for Typed Data
    Programmable String-ish Languages
    Programmable Typed Data
    Internal DSLs in General Purpose Languages

Their taxonomy is:

    String in a File
    A List
    Nested Data Structures
    Total Programming Languages
    Full Programming Language

So the last category (#5) is the same, the first one is different (they start with plain files), and the middle is a bit different.

FWIW I don’t think “Total” is useful – for example, take Starlark … The more salient things about Starlark are that it is restricted so that it evaluates very fast in parallel, and that it has no I/O to the external world. IMO it has nothing to do with Turing completeness.

Related threads on the “total” issue:

https://lobste.rs/s/dyqczr/find_mkdir_is_turing_complete

https://lobste.rs/s/gcfdnn/why_dhall_advertises_absence_turi...

ks2048 a day ago

> I actually like XML. It isn't "cool" like YAML anymore, but it has better tooling support (e.g. schema checking) and doesn't try to be too clever. Just try to stay away from namespaces and don't be afraid of using attributes.

I agree with this (even though, in practice, I usually just use JSON or YAML) - it avoids some of the pitfalls of both JSON and YAML: it has comments and lacks ambiguity. The main annoyances are textContent (is whitespace important?), attributes vs children, the verbosity of closing tags, etc.

  • retropragma a day ago

    Every time I work with XML data, I hate it. Just use JSONC imo.

bob1029 a day ago

I think SQL is one of the best level 4/5 configuration languages out there. Whether or not it's a "full programming language" depends on your specific dialect and how it's used.

  • danpalmer a day ago

    Only if what you specifically want is to represent queries. If what you want to represent is roughly static data then SQL is an incredibly awkward language to use.

    • bob1029 20 hours ago

      This is valid SQL:

        SELECT 'Constant'

      • danpalmer 2 hours ago

        Sure, but try creating a list of maps with string keys and values that can be ints, strings, lists, or maps, or protos for that matter.

        I've recently been writing quite a lot of unit tests for SQL, and the biggest pain point is setting up the state of your data, because SQL is just not an ergonomic language in which to do that. Mostly I've just ended up writing YAML describing the rows I want in a database (and I use proto fields in DBs, so some nesting there), and using a utility that loads that YAML into a table.

ks2048 a day ago

I'm not sure "total" vs "turing-complete" should be a huge difference - just terminate with an error after X seconds.

For example, can "total programming languages" include: "for i in range(10000000000000): do_something()"?

If so, your config file can still hang - even though it provably terminates.
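
One pragmatic version of the "X seconds" cutoff, assuming the config comes from a separate generator script (the script name is made up):

    import json
    import subprocess

    try:
        out = subprocess.run(
            ["python", "generate_config.py"],   # hypothetical level-5 generator
            capture_output=True, timeout=5, check=True,
        )
        config = json.loads(out.stdout)
    except subprocess.TimeoutExpired:
        raise SystemExit("config generation took longer than 5s; giving up")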

  • lmm a day ago

    It's a lot easier to accidentally make a file that takes forever than to accidentally make a file that takes a long but finite amount of time.

rdtsc a day ago

That’s a good breakdown.

In practice, configuration systems that reach level 4 or 5 start to look complex, and the whole thing ends up being rewritten at level 2 or 3. After a while it expands again, because it needs comments, include files, templating, for loops, etc., until it becomes a total mess, gets thrown out, and we cycle back to level 3.

iambvk a day ago

IMO there is another level in between 3 and 4, where the config file allows cross-referencing among config values, forming a graph:

   foo.password = xxx
   bar.password = yyy
   wifi.ssid = foo
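
A sketch of what resolving such references might look like, using a made-up ${other.key} syntax:

    import re

    raw = {
        "foo.password": "xxx",
        "bar.password": "${foo.password}",
        "wifi.password": "${bar.password}",
    }

    def resolve(key, seen=()):
        if key in seen:
            raise ValueError(f"reference cycle at {key}")
        return re.sub(
            r"\$\{([^}]+)\}",
            lambda m: resolve(m.group(1), seen + (key,)),
            raw[key],
        )

    print(resolve("wifi.password"))  # -> xxx
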
kristopolous a day ago

The gap between 3 and 4 is wider and has more members. Take CSS, for example.

  • qznc a day ago

    CSS is at least level 4. You can even argue for level 5, i.e. Turing-complete: https://stackoverflow.com/questions/2497146/is-css-turing-co...

    • kristopolous a day ago

      Level 4 is Turing complete.

      There were two parts to what I was talking about: things that are not quite that, and the fact that a configuration can have that capability in a fairly useless context.

      When I'm dealing with personal things or stuff that few people use, I will often make the configuration just something I eval/source.

      So it in theory has the same functionality as the underlying programming language, but in practice you're just supposed to use it like an INI.

      Here's a fairly large personal project where I use that

      https://github.com/kristopolous/music-explorer

      It actually allowed me to change the behavior depending on whether I'm running my program from my office or from home. So invoking the full fidelity of the underlying language actually has its benefits at times.

bblb a day ago

"vibeconfig"

1. You give LLM the requirements

2. It spits out whatever monstrosity is required to configure the software or service in question

3. When issues later arise, you just vibeconfig again with new requirements

Eventually new vibeconfig tools will rise because even those three steps are not complex enough. These call LLM APIs to inject the config files dynamically at runtime. "But it's a security issue". So another vertical is born: auditing and securing the vibeconfig LLM autogeneration toolsets.

somat a day ago

For my personal projects I am drifting towards the simple side. More and more I try to stick with a simple single-level key-value configuration.

But if you do require a complex configuration, I think it is beneficial to both yourself and your users to invest in an OpenBSD-style parse.y-like solution rather than just shoveling the usual JSON or YAML slop.

https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/usr...

By which I mean it does not have to be yacc, but taking the time to think about the language of your configuration improves the UI dramatically.

James_K a day ago

"Use the lowest level possible" has always seemed rather stupid advice to me. What I suggest: use XML. Every programming language under the sun can spit out XML files, so you can generate them programmatically if needed, and it's not as if you'll ever sit there wishing you'd gone for a simpler format. Sachems make the files practically self-documenting and the tooling for them is brilliant.

jiggawatts a day ago

In my opinion there's a "level 4.5" in between structured templating and full-blown procedural scripting: Using a general-purpose language to generate structured data, but then handing that over to a simpler system to materialise.

Pulumi is the best known example. Also, any time a normal programming language is used to generate something like an ARM template or any other kind of declarative deployment file.

This is the best-of-all-worlds in my opinion: Full capability, but with the safety of having an output that can be reviewed, committed to source control, diff-ed, etc...

  • tracnar 19 hours ago

    Agreed. Also if you can generate your configuration at build time, it matters much less whether you use a Turing complete language or not. It then allows you to enforce limitations you care about, like e.g. forbidding network access, or making sure it builds within X seconds.

runeblaze a day ago

But jsonnet is Turing complete…?

AtlasBarfed a day ago

This is just the complexity in individual files!

Configuration can be a lot more complicated. Look at dockerfiles, which are filesystems overlaid over each other, often sourced from the internet.

https://docs.spring.io/spring-boot/reference/features/extern...

Look at that: a massive 15-deep precedence order just for pulling individual values (oh man, and it doesn't even touch things like maps/lists that get merged/overridden).

That includes sources like the OS, environment-specific files, a database (the JNDI registry), XML, JSON, .properties files, and hardcoded values. Honestly, I remember this being even deeper; I suspect they have simplified it.

This doesn't even get into secrets/secured configuration, which may require a web service invocation. I also used to pull config via ssh, or from private Git repos or GitHub, or from AWS web service calls (THAT required another layer of config to get a TOTP-cycled credential).

https://crushedby1sand0s.blogspot.com/2021/02/stages-of-desp...

I was right, the Spring config fallthrough was deeper.