Wednesday, January 08, 2020

Learning Rust - Day 4 - Understanding Modules

I think I now get Modules in Rust. Well, sort of…I can’t be completely sure, but I think I have developed some form of logical heuristics that I can now use to think about the concept.

This is the 4th journal entry of my learning Rust journey. You can read other posts in this series by following the label learning rust.

Modules in Rust are a really slippery and confusing concept, so it took more than reading the 7th chapter of the book to arrive at some form of understanding. I had to scour through various blog posts, Reddit posts, issues on GitHub, chats on the Rust channel on Discord, etc.

For such a mundane concept as Modules, I think it took more effort than it should have, but then again, this comes as no surprise, as I was already aware that Rust's module system is one of the most confusing parts of the language.

I won’t attempt a Rust module tutorial here, since this is a journal entry and I am not 100% sure I have all the details locked down yet. But I will quickly pen down what I think is the core idea that helped me get a handle on the topic. If you know more about how the Rust module system works and fits together, and you spot a mistake in the things mentioned below, please do leave a comment! It would really be appreciated!

As anybody who has attempted to understand the module system in Rust will know, one of the confusing parts is mapping the definition of a module to the file system. What finally helped me was to think about the whole thing in the following way:

  1. In Rust you have packages, which hold crates, and crates hold modules. Modules can be thought of as a grouping mechanism for things like functions, traits, etc., and they help in controlling the visibility of these things.
  2. A Rust compilation unit (a crate) forms a tree, where the root of the tree is the src/main.rs file or the src/lib.rs file. A package can have both, in which case each one is the root of its own crate. Anything that should be included in compilation must be reachable somehow from the root file. The module system also forms a tree, with the root starting from that same file.
  3. If a module definition is seen in a file, i.e. mod modulename, this should be seen as the creation of a submodule by the parent module (the parent being the current file that contains the module definition). I think understanding this is quite key!
  4. To define a module starting from the root (src/main.rs and/or src/lib.rs), or in a file that is already part of the compilation unit, you write a mod mymodule definition. The question now is: where do you place the content of this module? I think there are two ways: inline or filesystem.
    1. Inline. The content is placed in {} right after the module name.
      mod mymodule {...content...}
    2. Filesystem. The content of the module can be found on the file system. There are two variants for this:
      1. Modules defined in main.rs or lib.rs. The file is placed in the src directory as well. So if you have mod mymodule; defined in ./src/main.rs or ./src/lib.rs, then the content of that module can be found in a file named mymodule.rs in the ./src directory (the compiler also accepts ./src/mymodule/mod.rs).
      2. Modules defined in files other than main.rs or lib.rs. If within mymodule.rs there is another module definition, then remember this means mymodule.rs is defining a submodule. The question now is, where would the content of this submodule be? Let us say the module definition is mod mysubmodule;
        It turns out the content would be found within an associated directory named src/mymodule/. It seems that, apart from main.rs and lib.rs, any file that defines a submodule (i.e. contains a module definition) also has an associated directory where the content of the defined module would be placed, and the associated directory carries the same name as the file, minus the .rs extension. So in this case, if within mymodule.rs we have mod mysubmodule; then there must be a directory named src/mymodule/ where the content of the modules defined in mymodule.rs would be placed. There are now two ways of populating this directory:
        1. Module name pattern. In this approach, the content of the module is placed in a file named after the module. In this case, within src/mymodule/ there would be a file named mysubmodule.rs, so the full path would be src/mymodule/mysubmodule.rs. I personally find this approach the simplest.
        2. mod.rs pattern. In this approach, the content of the module is placed in a file named mod.rs, which is then placed in a directory matching the name of the defined module. In this case, the full path would be src/mymodule/mysubmodule/mod.rs. I personally find this approach more confusing.
  5. All these definitions of modules, submodules, etc. lead to the idea of a path (similar to what you have in a file system), where modules that are part of the tree can be reached by specifying their path. The path separator in this case is :: and the root is crate, hence you can have something like crate::mymodule::mysubmodule as the path that allows reaching mysubmodule.
  6. When there is a need to use a definition within a module, it is best not to think of it as importing the module, but as using the path to reach the module. Having the path that reaches the module from where it exists in the module tree then gives access to its content (subject to the visibility rules, of course). This is what the use keyword allows. So at the top of a file, having use crate::mymodule::mysubmodule; means the path to reach mysubmodule is in scope, and this can be used to access the module.
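
The points above can be sketched in a single file. This is only a minimal illustration under some assumptions: it uses the inline form so that everything fits in one file, and the function names (prefix, greet, via_use, via_full_path) are made up for the example. The comments note where the content would live if the filesystem form were used instead.

```rust
// The crate root (this file plays the role of src/main.rs).
// `mod mymodule { ... }` is the inline form. Writing `mod mymodule;` instead
// would make the compiler look for the content in src/mymodule.rs.
mod mymodule {
    // Private item: visible inside `mymodule` and its descendants only.
    fn prefix() -> &'static str {
        "hello from"
    }

    // `mymodule` is the parent that creates `mysubmodule`.
    // With `mod mysubmodule;` written in src/mymodule.rs, the content would
    // live in src/mymodule/mysubmodule.rs or src/mymodule/mysubmodule/mod.rs.
    pub mod mysubmodule {
        pub fn greet() -> String {
            // `super` is the relative path to the parent module.
            format!("{} mysubmodule", super::prefix())
        }
    }
}

// Bring the path to `mysubmodule` into scope. Nothing is loaded or copied;
// the name `mysubmodule` simply becomes usable in this file.
use crate::mymodule::mysubmodule;

// Reaches `greet` through the name brought into scope by `use`.
fn via_use() -> String {
    mysubmodule::greet()
}

// Reaches the same function by spelling out the absolute path.
fn via_full_path() -> String {
    crate::mymodule::mysubmodule::greet()
}

fn main() {
    println!("{}", via_use());
    println!("{}", via_full_path());
}
```

Both functions reach the same greet function; use only shortens the path that has to be written. Removing pub from mysubmodule or from greet would make both calls fail to compile, which is the visibility rule at work.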

Perhaps after I have spent more time with the language, and I am more certain about how the module system fits together, I will be able to write a more extensive post with step-by-step examples. For now, I will only mention that Rust's module system is atypical, hence the intuition from other programming languages does not help in understanding how modules work in Rust. In fact, what you know about modules (or related concepts from your favourite language) is probably going to be a stumbling block.

Here are some of the links I ran into that helped shed more light on the topic.

Rust modules vs files
The Rust module system is too confusing
Inline module syntax is confusing when learning modules (GitHub issue)
Data point about the new module system learnability and musings about language stability (Rust Internals)

Apart from getting a handle on module definitions and their mapping to the file system, there are other points I got from reading the 7th chapter that are worth mentioning:

  1. Visibility is controlled using the pub keyword; items are private by default.
  2. Structs can have some of their fields public while others stay private. This is not the case with enums: if an enum is public, all of its variants are public.
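
A small sketch of these two points, assuming a made-up kitchen module (the names Breakfast, Appetizer, summer, describe and order are illustrative only, loosely in the spirit of the restaurant examples in the book):

```rust
mod kitchen {
    pub struct Breakfast {
        pub toast: String,      // public field: code outside `kitchen` can read and write it
        seasonal_fruit: String, // private field: only code inside `kitchen` can touch it
    }

    impl Breakfast {
        // A public constructor is needed, because code outside `kitchen`
        // cannot build a `Breakfast` literal without naming the private field.
        pub fn summer(toast: &str) -> Breakfast {
            Breakfast {
                toast: String::from(toast),
                seasonal_fruit: String::from("peaches"),
            }
        }

        pub fn describe(&self) -> String {
            format!("{} toast with {}", self.toast, self.seasonal_fruit)
        }
    }

    // Marking the enum `pub` makes every variant public.
    #[derive(Debug, PartialEq)]
    pub enum Appetizer {
        Soup,
        Salad,
    }
}

fn order(want_soup: bool) -> (String, kitchen::Appetizer) {
    let mut meal = kitchen::Breakfast::summer("rye");
    meal.toast = String::from("wheat"); // allowed: `toast` is pub
    // meal.seasonal_fruit = String::from("plums"); // would not compile: private field

    let starter = if want_soup {
        kitchen::Appetizer::Soup
    } else {
        kitchen::Appetizer::Salad
    };
    (meal.describe(), starter)
}

fn main() {
    let (meal, starter) = order(false);
    println!("{}, starter: {:?}", meal, starter);
}
```

Uncommenting the assignment to seasonal_fruit should produce a compile error about a private field, while both enum variants can be named freely from outside the module.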

Even though this chapter took more effort than the previous chapters, I am still quite happy with the results. As mentioned in the previous post, I was a little bit anxious about approaching modules in Rust, so even though it took some time, I think I was able to get some good understanding in a reasonable amount of time.

So looking forward to continuing with the rest of the book!
