Sunday, January 26, 2020

Learning Rust - Day 8 - An I/O Project: Building a Command Line Program

This is the 8th journal entry of my learning Rust journey. It captures some of the learning points while going through chapter 12 of the Rust Book. You can read other posts in this series by following the label learning rust.

Not many new concepts were introduced in this chapter. The aim was to apply the material presented in the book up to that point by building a trivial command line tool.

While working through the chapter though, I observed a couple of different strategies for dealing with errors via the Result<T,E> type.

Turn error to boolean via is_err
This seems handy when you want to convert the result to a boolean based on whether the result was a success or not:

let x: Result<i32, &str> = Ok(-3);
assert_eq!(x.is_err(), false);

let x: Result<i32, &str> = Err("Some error message");
assert_eq!(x.is_err(), true);

Get success or run some logic via unwrap_or_else
This allows getting the success value, and in case of error, allows passing a callback function to process the error.

fn count(x: &str) -> usize { x.len() }

assert_eq!(Ok(2).unwrap_or_else(count), 2);
assert_eq!(Err("foo").unwrap_or_else(count), 3);

Ignore success and only run some code on failure via if let syntax
This seems to be useful in cases where you only want to do something in case calling a function returns an error.

if let Err(e) = run(config) {
    eprintln!("Application error: {}", e);
    process::exit(-1)
}

These are by no means all the available error handling strategies when dealing with Result in Rust. They are only the ones that I picked up on while reading through chapter 12.
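To see the three strategies side by side, here is a minimal sketch. The run function below is a made-up stand-in (it just parses and doubles a number), not the one from the book:

```rust
use std::num::ParseIntError;

// Hypothetical stand-in for the chapter's `run` function: parse a number
// and double it, failing if the input is not a number.
fn run(input: &str) -> Result<i32, ParseIntError> {
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}

fn main() {
    // 1. Collapse the Result into a boolean.
    let failed = run("not a number").is_err();
    println!("failed: {}", failed);

    // 2. Take the success value, or compute a fallback from the error.
    let value = run("not a number").unwrap_or_else(|err| {
        eprintln!("falling back to 0: {}", err);
        0
    });
    println!("value: {}", value);

    // 3. Ignore success and only react to failure.
    if let Err(e) = run("21x") {
        eprintln!("Application error: {}", e);
    }
}
```

Which one to reach for seems to depend on whether you care about the success value, the error value, or neither.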

Another thing worth noting, which is more or less like a culture shock, is the practice of putting unit tests in the same file as the code being tested. All languages I have used before now had the practice of having the tests external to the code being tested; but it seems in idiomatic Rust, the tests go together with the implementation. This would require some getting used to!

That was it for this chapter. I now look forward to exploring the Functional Language Features in Rust as I proceed to chapter 13.

Friday, January 24, 2020

Learning Rust - Day 7 - Writing Automated Tests

This is the 7th journal entry of my learning Rust journey. You can read other posts in this series by following the label learning rust.

In this session, I read through Chapter 11 of The Rust Book. It was about writing automated tests in Rust. This chapter was a breeze, for obvious reasons. In fact, the most interesting thing I learnt was not even about testing in Rust, but about another feature in the language: Attributes.

I have always noticed things like #[derive(Debug)], #![allow(unused_variables)], etc. being used in the language, but I never stopped to actually read up on what they are. I knew they were a facility for providing some form of metadata in the language, sort of like @annotations in Java; I just never took the time to find out what they are officially called in Rust.

It ended up being that the testing mechanism in Rust revolves around the use of this language feature: notably #[cfg(test)] and #[test], so I took the opportunity to find out what exactly these things were.

They are Attributes and they are:
...a general, free-form metadatum that is interpreted according to name, convention, language, and compiler version...
They can exist in two forms: inner attributes, the ones that start with #!, and outer attributes, the ones that start with only #. The outer attribute is placed outside something (i.e. before a struct definition, function definition, module definition etc.) and applies to the thing that follows the attribute. The inner attribute is placed inside something, e.g. in the root of a crate in order to apply an attribute to the whole crate, and applies to the item that the attribute is declared within.
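A small sketch of the two forms in use. The module and type names are made up for illustration:

```rust
mod shapes {
    // Inner attribute (`#!`): applies to the enclosing item, here the
    // whole `shapes` module.
    #![allow(dead_code)]

    // Outer attribute (`#`): applies to the item that follows it.
    #[derive(Debug, PartialEq)]
    pub struct Point {
        pub x: i32,
        pub y: i32,
    }

    // Never called; the inner attribute above silences the warning.
    fn unused_helper() {}
}

fn main() {
    let p = shapes::Point { x: 1, y: 2 };
    println!("{:?}", p);
}
```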

Attributes can also be classified into the following kinds: Built-in attributes, Macro attributes, Derive macro helper attributes, and Tool attributes. So far, I find the Built-in attributes to be the most interesting ones because, based on my knowledge of Rust, they are the ones I have mostly encountered. A list of these Built-in attributes can be found here.

Apart from learning more about Attributes, there were a couple of things I picked up about testing in Rust that are worth noting:

  • Use the #[cfg(test)] attribute on modules that contain test functions. Use the #[test] attribute on test functions within the test module.
  • assert!, assert_eq! and assert_ne! are macros that can be used for asserting test conditions.
  • Favour assert_eq! and assert_ne! over assert! when comparing values, because on failure they print the two values being compared. All three macros also accept an optional custom failure message.
  • cargo test --help displays options that can be used with cargo test while
    cargo test -- --help displays the options you can use after the separator --. It looks like the latter can only be run in the root of a Rust project.
  • You can’t use the #[should_panic] annotation on tests that use Result<T, E>. Such a test fails when it returns an Err value; to assert that an operation produces an error, check the result directly, e.g. with assert!(value.is_err()).
  • Run the tests with cargo test -- --test-threads=1 to prevent concurrent execution of tests 
  • Use the #[ignore] attribute to ignore some tests unless specifically requested.
  • To specifically run a single test, pass the name of the test function to cargo test. That is
    cargo test name_of_test_function 
  • Use cargo test -- --nocapture to also see the outputs (if any) from succeeding tests. By default outputs are shown only for failed tests
  • If there is setup code to be shared across different tests, then make sure to put them inside of tests/common/mod.rs instead of tests/common.rs. Not doing this would make the setup code appear in the test results.
  • Unit tests are placed in the same file as the module, while integration tests go into tests/integration_test.rs and they do not need #[cfg(test)] attribute; only the #[test] attribute is needed.
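Several of the points above can be sketched in one file. The add and divide functions are hypothetical examples of code under test:

```rust
// Hypothetical functions under test.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

pub fn divide(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}

// Only compiled when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds_two_numbers() {
        // The third argument is an optional custom failure message.
        assert_eq!(add(2, 2), 4, "add(2, 2) should be 4");
        assert_ne!(add(2, 2), 5);
    }

    #[test]
    #[should_panic(expected = "division by zero")]
    fn dividing_by_zero_panics() {
        divide(1, 0);
    }

    // A test that returns Result: returning Err makes it fail.
    #[test]
    fn result_based_test() -> Result<(), String> {
        if add(1, 1) == 2 {
            Ok(())
        } else {
            Err(String::from("one plus one was not two"))
        }
    }

    // Skipped unless run with `cargo test -- --ignored`.
    #[test]
    #[ignore]
    fn expensive_test() {
        assert_eq!(divide(100, 10), 10);
    }
}

fn main() {
    println!("{}", add(2, 2));
}
```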
That was it for learning about writing automated tests in Rust. I would be taking on Chapter 12 next, which is: An I/O Project: Building a Command Line Program. It looks like a chapter that would help solidify some of the concepts presented in the book thus far. Looking forward!

Thursday, January 23, 2020

Learning Rust - Day 6 - Generic Types, Traits, and Lifetimes

This is the 6th journal entry of my learning Rust journey. You can read other posts in this series by following the label learning rust.

In this study session I went through chapter 10 of the Rust book, which covers Generic Types, Traits, and Lifetimes. Generic types and Traits were easy to digest, as they are concepts I am already familiar with from other languages. It was Lifetimes that proved to be the difficult nut to crack. Just like when going over the section about Modules, I had to consult other sources outside the book in order to be able to wrap my head around the concepts. I would not say I have it 100% locked down, but I think I now know enough of the general gist to proceed with the rest of the book.

Generics and Traits

Going over Generic Types and Traits was more or less about learning how these concepts are encoded in Rust. It did introduce a lot of syntax, which I outline below:

Generic related syntax

// Function that defines the generic type    
fn generic_function<T>(input: T) -> T {
    unimplemented!()
}

// Generic Struct and Enum    
struct Point<T> {
    x: T,
    y: T,
}

enum Color<T> {
    Red(T),
    Green(T),
    Blue(T),
}

// Generic methods
impl<T> Color<T> {
    fn get_hue<U>(&self) -> T {
        unimplemented!()
    }
}

Trait related syntax

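// Display and Debug are used in some of the signatures further down
use std::fmt::{Debug, Display};

// defining a trait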
pub trait Summary {
   fn summarize(&self) -> String;
}

// defining a trait with default implementation    
pub trait DefaultSummary {
   fn summarize(&self) -> String {
      unimplemented!()
   }
}

pub struct Tweet {
     pub username: String,
     pub content: String,
     pub reply: bool,
     pub retweet: bool,
}

// defining an instance of a trait for a type
impl Summary /* <- trait name */ for Tweet /* <- type name */ {
    fn summarize(&self) -> String {
        unimplemented!()
    }
}

// specifying function parameters as accepting traits    
// uses the impl trait syntax    
fn notify(text: impl Summary) -> String {
    text.summarize()
}

// specifying function parameters as accepting traits    
// uses the trait bound syntax    
fn another_notify<T: Summary>(text: T) -> String {
    text.summarize()
}

// specifying multiple traits for a type    
fn multiple_notify(text: impl Summary + Display) -> String {
    unimplemented!()
}

fn another_multiple_notify<T: Summary + Display>(text: T) -> String {
    unimplemented!()
}

// specifying multiple traits with where clause    
// instead of    
fn some_function<T: Display + Clone, U: Clone + Debug>(t: T, u: U) -> i32 {
    unimplemented!()
}
    
// we have    
fn some_other_fn<T, U>(t: T, u: U) -> i32 where T: Display + Clone, U: Clone + Debug {
    unimplemented!()
}

// which looks better with new lines
fn some_other_f<T, U>(t: T, u: U) -> i32
where
    T: Display + Clone,
    U: Clone + Debug,
{
    unimplemented!()
}

// Returning Types that Implement Traits    
fn return_trait() -> impl Summary {
   Tweet {
     username: "".to_string(),
     content: "".to_string(),
     reply: false,
     retweet: false        
   }
}

// implementing methods on a generic struct, 
// if the type parameter implements some traits    
struct Pair<T> {
    x: T,
    y: T,
}

// new method would be available, regardless of T
impl<T> Pair<T> {
    fn new(x: T, y: T) -> Self {
        Self { x, y }
    }
}

// cmp_display would be available only if T has instances for Display
// and PartialOrd
impl<T: Display + PartialOrd> Pair<T> {
    fn cmp_display(&self) {
        if self.x >= self.y {
            println!("The largest member is x = {}", self.x);
        } else {
            println!("The largest member is y = {}", self.y);
        }
    }
}

// blanket implementations: implement Summary for any T
// that already implements Display (the standard library does
// the same to provide ToString for every Display type)
impl<T: Display> Summary for T {
   fn summarize(&self) -> String {
      unimplemented!()
   }
}

fn re(x: &str) -> &str {
   unimplemented!()
}

Some noteworthy learning around generics include:

Given a generic type T, T can be a type that lives on the heap or on the stack. This can't be determined from just the generic type signature. I think this has ramifications when it comes to borrowing.

Some noteworthy learning around traits include:

It is possible to implement a trait on a type only if either the trait or the type is local to my crate.

Basically either of the following scenario:
  • I have my local type, I have an external Trait. I can import the external trait and implement it for my local type.
  • I have a local Trait, I have an external type. I can import the external type and implement my trait for it.
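Both scenarios can be sketched as follows, with std::fmt::Display and Vec<String> playing the roles of the external trait and the external type. The local names are made up for illustration:

```rust
use std::fmt;

// A local trait and a local type.
trait Summary {
    fn summarize(&self) -> String;
}

struct Tweet {
    username: String,
    content: String,
}

// Local type + external trait: allowed.
impl fmt::Display for Tweet {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "@{}: {}", self.username, self.content)
    }
}

// Local trait + external type: allowed.
impl Summary for Vec<String> {
    fn summarize(&self) -> String {
        format!("{} item(s)", self.len())
    }
}

// External trait + external type: rejected by the compiler.
// impl fmt::Display for Vec<String> { /* ... */ }

fn main() {
    let tweet = Tweet {
        username: String::from("rustlang"),
        content: String::from("hello"),
    };
    println!("{}", tweet);
    println!("{}", vec![String::from("a"), String::from("b")].summarize());
}
```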
The trait bound syntax can enforce that multiple function parameters implement the same traits and are also of the same type, by giving them the same type parameter. This is not the case with the impl trait syntax.

// first and second has to be the same type
// they also must have an implementation for trait Summary
fn multiple_function_same_type<T: Summary>(first: T, second: T) -> i32 {
    unimplemented!()
}

// first and second can be a different type
// but they must have an implementation for trait Summary
fn multiple_function(first: impl Summary, second: impl Summary) -> i32 {
    unimplemented!()
}

If I have a function whose return type is represented with a trait, it is not possible for that function to have an implementation that could return any of the available implementations, i.e. via if else. The implementation should only be returning one type that implements that trait, even if the types implement the same traits. This restriction can be circumvented though, using trait objects. But I won't be learning about that, not until Chapter 17.
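A rough sketch of the restriction and of the trait object workaround. Article and the pick function are hypothetical, and Box<dyn Summary> is only one way of spelling a trait object:

```rust
trait Summary {
    fn summarize(&self) -> String;
}

struct Tweet {
    content: String,
}

struct Article {
    title: String,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("tweet: {}", self.content)
    }
}

impl Summary for Article {
    fn summarize(&self) -> String {
        format!("article: {}", self.title)
    }
}

// Does not compile with `-> impl Summary`, because the two branches
// return different concrete types:
//
// fn pick(short: bool) -> impl Summary {
//     if short { Tweet { /* ... */ } } else { Article { /* ... */ } }
// }

// With a trait object, both branches have the same type: Box<dyn Summary>.
fn pick(short: bool) -> Box<dyn Summary> {
    if short {
        Box::new(Tweet { content: String::from("hi") })
    } else {
        Box::new(Article { title: String::from("Hello") })
    }
}

fn main() {
    println!("{}", pick(true).summarize());
    println!("{}", pick(false).summarize());
}
```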

Lifetime

The bulk of the time in this study session was spent grokking (or trying to?) lifetimes. Even though I do not have the concept 100% locked down, some points I think are worthy of note. I list these below.

I think the big idea about lifetimes is that they ensure all borrows are valid. That means ensuring that every use of & is valid. The borrow checker can be seen as the enforcer that is saddled with this task, and in order to perform it, it uses the concept of lifetimes.

For example, given the following non compiling function definition:

fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

The use of & can either be a valid or invalid borrow.

The use of & in the arguments can be assumed to always be valid, since the value needs to exist in order for it to be borrowed into the function to be used. The same can't be said of the use of & in the return value.

The value the borrow in the return refers to can only come from two places: from within the function or from the input to the function. If the borrow is to a value created within the function, then you have an automatically invalid situation, because once the function call is over the value would be cleared, leading to a dangling pointer situation.
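A minimal sketch of the two cases. The function names are made up, and the rejected version is left in a comment since it does not compile:

```rust
// Rejected by the compiler: `s` is dropped when the function returns,
// so the returned reference would dangle.
//
// fn dangling() -> &str {
//     let s = String::from("created inside");
//     s.as_str()
// }

// Returning the owned value instead moves it out to the caller.
fn owned() -> String {
    String::from("created inside")
}

// Returning a borrow derived from the input is fine: the caller's value
// outlives the call.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let s = owned();
    println!("{}", s);
    println!("{}", first_word("hello world"));
}
```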

In the case where the borrow in return is based off the input, then it becomes impossible to immediately tell if the value that was borrowed into the function would be valid for as long as the variable that ends up holding the return value. Hence why the above code snippet won't compile.

In order to make the above compile, extra information needs to be provided that helps the borrow checker ensure that the returned borrow would indeed continue to be valid. This is done using lifetime annotations.

The updated function with lifetime annotation provided would look like this:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
} 

So what does this buy us? 

My interpretation is that we are telling Rust, that whenever this function is used, the inputs and the return value must be covered by the same lifetime. 

This means that for the returned borrow, the scope should be shorter than, or at most as long as, those of the inputs. In the case where the lifetime/scope of the return value is longer than that of any of the inputs, the constraint of the lifetime annotation is violated and Rust won't compile the code.

To make the above concrete, we take a look at two scenarios where the function above is used: one where the lifetime constraints are respected and another where they are violated.

fn main() {
  let string1 = String::from("long string is long");

  {
    let string2 = String::from("xyz");
    let result = longest(string1.as_str(), string2.as_str());
    println!("The longest string is {}", result);
    // the lifetime of result ends here    
    // the lifetime of string2 ends here    
    // the lifetime of string1 continues    
    /* if result refers to string2, then it is fine, 
       because they have same lifetime */    
    /* if result refers to string1, it is fine, 
       because result has a shorter lifetime */  
   }
}

The above respects the lifetime annotations.

fn main() {
  let string1 = String::from("long string is long");
  let result;
  {
    let string2 = String::from("xyz");
    result = longest(string1.as_str(), string2.as_str());
    // the lifetime of result continues after here    
    // the lifetime of string2 ends here    
    // the lifetime of string1 continues after here    
    /* if result refers to string1 then it is fine,       
       because lifetime of result and string1 have same lifetime */
    /* but if result refers to string2, then it is not fine,      
       because lifetime of result is longer than string2,       
       hence violating the lifetime annotation */    
     }
    println!("The longest string is {}", result);
}

So the function signature:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str

...could be interpreted as: the return value should never have a longer lifetime than any of the inputs. And I think this is in line with the general rule of borrowing: the subject of a reference should live at least as long as the variable that references it.

In the first entry in this series, I noted that the plan is that, after I become proficient in Rust, I can return to this series of posts and cringe at my ignorance! I think this post, out of all the others in this series, is the one where that statement rings truest! 😂

Anyways, I think these were the main points from reading through chapter 10. This chapter took longer than expected, so I am really itching to continue with the rest of the book, hopefully with no more stumps!

Sunday, January 12, 2020

Learning Rust - Day 5 - Common Collections and Error Handling

This entry captures the highlights of reading through chapter 8 (Common Collections) and chapter 9 (Error Handling) of The Book. It marks my fifth entry in the Learning Rust journey. For more entries like this, see the posts with the learning rust label.

I hang out in the beginners section of the Rust channel on Discord, and every now and then I see folks come around and say something like "I have been fighting with the borrow checker". I think that phrase came to life for me in this learning session. The last section of chapter 8 has 3 exercises, and while trying to work on those exercises, I found myself in situations where it felt like I was wrestling with the borrow checker! I know, I know, the borrow checker is just trying to save me from my own stupidity, but right now it feels more like a struggle! 😂 And I guess these early encounters with the borrow checker signify the beginning of my initiation rites towards becoming a Rustacean?

Anyways, in this post, as I do, I pen down some of the things that stood out for me.

First thing was that + has a weird API when used for concatenating strings. It requires the second operand to be a borrow, while basically making the first string unusable after concatenation.

let hello = String::from("Hello");
let world = String::from(" world");
let hello_world = hello + &world; // world has to be a borrow
println!("{}", hello_world);
println!("{}", hello); // hello can't be used anymore due to move

The + has to be the most natural API for concatenating strings in previous languages I have worked with. I am not sure of the explanation for this unique incarnation in Rust. It seems, in order to get the more familiar behaviour, the format! macro should be used, and that basically provides string interpolation functionality:

let hello = String::from("Hello");
let world = String::from(" world");
let hello_world = format!("{}-{}",hello, world);
println!("{}", hello_world);
println!("{}", hello); // still usable
println!("{}", world); // still usable

I also learnt that strings in Rust do not support indexing. This means one cannot access an individual character in a string by referencing it by index. That is, this would lead to a compile time error:

let hello = String::from("Hello world");
println!("{}", hello[0])

This is due to the fact that Rust stores a string as a vector of bytes, encoded as UTF-8. This means that a character could actually be represented by more than 1 byte, and in such a case, indexing that only returns a byte would make no sense. As the example that was given in the book shows:

print!("{}", String::from("Здравствуйте").len());

The above might look like it is composed of 12 characters, and hence represented by Rust with a vector of 12 slots. In reality this is not the case. The above is backed by a vector that is 24 bytes long. Hence a facility for indexing into a string to return a character does not make sense.

What Rust provides instead is the ability to retrieve slices from a String. For example:

println!("{}", &hello[0..1])

And also the ability to turn a String into Unicode scalar values (via chars) or bytes (via bytes), which can then be iterated over.
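A short sketch of the difference between bytes and characters, using the same word as the book's example:

```rust
fn main() {
    let hello = String::from("Здравствуйте");

    // `len` counts bytes, not characters.
    println!("bytes: {}", hello.len());
    println!("chars: {}", hello.chars().count());

    // Slice ranges are byte offsets. Each of these letters takes two
    // bytes, so the first letter is the range 0..2. A range that cuts a
    // character in half (such as 0..1 here) panics at runtime.
    println!("first letter: {}", &hello[0..2]);

    // Iterating over Unicode scalar values.
    for c in hello.chars().take(3) {
        println!("char: {}", c);
    }

    // Iterating over raw bytes.
    for b in hello.bytes().take(4) {
        println!("byte: {}", b);
    }
}
```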

I also learnt that Rust has a thing called deref coercion, although this was not covered in these chapters. It would be in chapter 15.

While reading through the section on Hash Map, the or_insert caught my attention. The form it appeared in the book is similar to the following:

use std::collections::HashMap;

let mut map: HashMap<&str, i32> = HashMap::new();
let r = map.entry("1").or_insert(0);
*r += 10;
*r += 10;
println!("{}", map.get("1").unwrap()); // prints 20

I do not know enough Rust to know what is idiomatic or not, but I find the UX around the API a little odd. The fact that an operation that inserts a value for a key, when that key is absent, also returns a handle to the value which can be used to mutate the value, feels overloaded. Also, one can assume that if map.entry("1").or_insert(0) returns the value which can be later mutated, then map.entry("1") should also do the same, with the step of trying to create it if it is absent removed. But no, the API does not have that level of symmetry.
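For what it is worth, here is a small sketch of how the pieces fit together. entry on its own returns an Entry enum (occupied or vacant) rather than the value, and or_insert is what turns that into a mutable reference. The count_words function is a made-up example:

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Hypothetical helper: count how often each word occurs.
fn count_words(text: &str) -> HashMap<&str, i32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // `or_insert` returns `&mut i32`, pointing at the value in the map.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let mut counts = count_words("a b a");

    // `entry` alone yields an Entry, which can be inspected without
    // inserting anything.
    match counts.entry("c") {
        Entry::Occupied(slot) => println!("already present: {}", slot.get()),
        Entry::Vacant(_) => println!("c is absent, nothing inserted"),
    }

    println!("{:?}", counts.get("a"));
    println!("{:?}", counts.get("c"));
}
```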

Also, I know mutation in Rust is safe because of the borrow checker, but I am still not sure if that is a license to use mutation all over the place. It might be easier for the compiler to ensure that mutation does not lead to bugs, but I am not sure that level of mutation would also help a human in understanding the logic in the code. I personally still have a preference for the functional programming approach of treating everything as immutable values and dealing with effects instead of mutation and side effects, as I think it also adds to the simplicity and elegance of the code. Fingers crossed, and mind always open: as I learn more and use Rust, I hope to understand its idiomatic use and philosophy.

While going through the chapters for this study session, I had to revisit some of the information provided in Chapter 4 regarding ownership, slice types etc. I realised there are some cogent points about ownership and borrowing I did not include in my Learning Rust - Day 2 - Getting Acquainted with Ownership post, so I am including them here. These are just some laws that govern how ownership et al. work. The items in bold are a verbatim copy from the Rust book.

  • Each value in Rust has a variable that’s called its owner.
  • This variable can either be a stack variable, that is, it points to data that resides in the stack or a heap variable, that is, it points to data on the heap.
  • When it points to the heap, then you can see the variable as containing the pointer, length, capacity etc: basically things you can see as ownership related data
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value will be dropped.
  • When passing data on the heap to functions, instead of making a copy of the data to pass to the function, a pointer to the location of the data is passed instead. This is a reference. This act is called borrowing.
  • At any given time, you can have either one mutable reference or any number of immutable references. 
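The rules above can be sketched in a few lines. The length and shout helpers are hypothetical:

```rust
// Borrows the string immutably; the caller keeps ownership.
fn length(s: &str) -> usize {
    s.len()
}

// Borrows the string mutably; the caller keeps ownership.
fn shout(s: &mut String) {
    s.push('!');
}

fn main() {
    let a = String::from("hello");
    let b = a; // ownership moves from `a` to `b`
    // println!("{}", a); // would not compile: `a` was moved

    let first = &b;
    let second = &b; // any number of immutable references is fine
    println!("{} {} {}", first, second, length(&b));

    let mut c = b; // move again, this time into a mutable binding
    shout(&mut c); // only one mutable reference at a time
    println!("{}", c);
} // `c` goes out of scope here and the string is dropped
```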


Turning to the Error handling chapter, I learnt about the panic = 'abort' setting that can be added to the Cargo.toml file. It has the effect of switching off the memory clean up Rust does when a panic occurs. The memory would then have to be freed up by the operating system. Doing this has the added effect of reducing the size of the generated Rust binary. And by the way, it seems panic is Rust lingo for exception.

I also learnt about the RUST_BACKTRACE environment variable, which can be used to list the full path and the actual portion of the user code that triggered the panic. You can see this as just having a proper stack trace. Interestingly enough, this week I learnt this would no longer be needed, as recent changes to the Rust compiler now make panic messages point to the location where they were called, rather than core's internals.

Up until now, I never encountered the return keyword. But I now know that Rust has a return keyword, and that it is mostly used for explicitly signifying a return out of a function, especially where there could be multiple return paths. Its usage is closely related to the ? operator (the question mark operator), which returns early from the function when it meets an error, and which seems to embody the spirit of the do notation in languages like Haskell and PureScript. So far, I have only seen how to use it with the Result type, but not the Option type.
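A minimal sketch of ? in action. It works in functions returning Option as well as Result; both function names here are made up:

```rust
use std::num::ParseIntError;

// With Result: `?` returns the Err early, otherwise unwraps the Ok value.
fn parse_and_double(input: &str) -> Result<i32, ParseIntError> {
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}

// With Option: `?` returns None early, otherwise unwraps the Some value.
fn first_char_upper(input: &str) -> Option<char> {
    let c = input.chars().next()?;
    Some(c.to_ascii_uppercase())
}

fn main() {
    println!("{:?}", parse_and_double("21"));
    println!("{:?}", parse_and_double("twenty-one"));
    println!("{:?}", first_char_upper("rust"));
    println!("{:?}", first_char_upper(""));
}
```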

As I mentioned in the beginning, I had to fight with the borrow checker while implementing the exercises listed at the end of chapter 8. But what I found astonishing was how many of these errors were lifetime related!

Here is a list of some of the errors I remembered to capture:
  • these two types are declared with different lifetimes...
  • ^^^ borrowed value does not live long enough
  • ^ expected lifetime parameter
  • help: consider giving it a 'static lifetime: `&'static`
I guess I would be learning a ton when I start with the next chapter, which is about Generic Types, Traits... and you guessed it... Lifetimes!!! I can't wait!

Wednesday, January 08, 2020

Learning Rust - Day 4 - Understanding Modules

I think I now get Modules in Rust. Well, sort of…I can’t be completely sure, but I think I have developed some form of logical heuristics that I can now use to think about the concept.

This is the 4th journal entry of my learning Rust journey. You can read other posts in this series by following the label learning rust.

Modules in Rust are a really slippery and confusing concept, so it took more than reading the 7th chapter of the book to arrive at some form of understanding. I had to scour through various blog posts, reddit posts, issues on Github, chats on Rust channel on Discord etc.

For such a mundane concept as Modules, I think it took more effort than it should, but then again, this comes as no surprise, as I was already aware that Rust's module system is one of the most confusing parts of the language.

I won’t attempt a Rust module tutorial here; since this is a journal entry and also since I am not 100% sure if I have all the details locked down yet. But I would quickly pen down what I think is the core idea that helped me in getting the handle of the topic. If you know more about how the Rust module system works and fit together, and you spot a mistake in the things mentioned below, please do leave a comment! It would really be appreciated!

As anybody who has attempted to understand the module system in Rust will know, one of the confusing parts is mapping the definition of a module to the file system. The thing that finally helped me was to think about the whole thing in the following way:

  1. In Rust you have packages, which hold crates. And crates hold modules, where modules can be thought of as some sort of grouping mechanism for things like functions, traits, etc. that also helps in controlling the visibility of these things.
  2. Rust compilation unit forms a tree, where the root of the tree is the src/main.rs file and/or the src/lib.rs file. Anything that should be included in compilation must be reachable somehow from these two files. The module system also forms a tree, with the root starting from these two files.
  3. If a module definition is seen in a file, i.e mod modulename, this should be seen as the creation of a submodule by the parent module (which is the current file that contains the module definition) - I think understanding this is quite key!
  4. To define a module starting from the root (src/main.rs file and/or the src/lib.rs) or in a file that is already part of the compilation unit, you write mod mymodule;. The question now is, where do you place the content of this module? I think there are two ways: inline or filesystem.
    1. Inline. Where the content is placed in {} after the module definition.
      mod mymodule {...content...}
    2. Filesystem. The content of the module can be found on the file system. There are two variants for this:
      1. modules defined in main.rs or lib.rs. The file would be placed also in the src directory. So if you have mod mymodule; defined in ./src/main.rs or ./src/lib.rs, then the content of that module can be found in a file mymodule.rs in the ./src directory.
      2. modules defined in other files apart from main.rs or lib.rs. If within mymodule.rs there is another module definition, then remember this means mymodule.rs is defining a sub module. The question now is, where would the content of this submodule be? Let us say the module definition is:
        mod mysubmodule;. It turns out the content would be found within an associated directory named /mymodule/. It seems apart from main.rs and lib.rs, any file that defines a submodule (i.e. contains a module definition) also has an associated directory where the content of the defined module would be placed, and the associated directory carries the same name as the file. So in this case, if within mymodule.rs we have mod mysubmodule; then there must be a directory named /mymodule/ where the content of the modules defined in mymodule.rs would be placed. There are now two ways of populating this directory:
        1. module name pattern. In this approach, the content of the module is placed in a file named just as the module name. In this case, within /mymodule/ there would be a file named mysubmodule.rs. So the full path would be src/mymodule/mysubmodule.rs. I personally find this approach the simplest.
        2. mod.rs pattern. In this approach, the content of the module would be placed in a file named mod.rs, that would then be placed in a directory matching the name of the defined module. In this case, the full path would be src/mymodule/mysubmodule/mod.rs. I personally find this approach more confusing.
  5. All the definition of modules, submodules etc leads to the idea of a path (similar to what you have in a file system) where modules that are part of the tree can be reached by specifying their path. The path separator in this case is :: and the root is crate, hence you can have something like crate::mymodule::mysubmodule as the path that allows reaching mysubmodule
  6. When there is a need to use a definition within a module, it is best not to think of it as importing the module, but as using the path to reach the module. Having the path that reaches the module from where it exists in the module tree then gives access to its content (subject to the visibility rules of course). This is what the use keyword allows. So at the top of a file, having use crate::mymodule::mysubmodule means the path to reach mysubmodule is in scope and this can be used to access the module.
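The tree described in the list above can be sketched in a single file by writing the modules inline. The greet function is made up, and the file layout in the comment is the "module name pattern" variant:

```rust
// File layout this sketch mirrors:
//
//   src/main.rs                  <- crate root, contains `mod mymodule;`
//   src/mymodule.rs              <- contains `pub mod mysubmodule;`
//   src/mymodule/mysubmodule.rs  <- content of mysubmodule
//
// Here the same tree is written inline so that it fits in one file.
mod mymodule {
    pub mod mysubmodule {
        pub fn greet() -> &'static str {
            "hello from mysubmodule"
        }
    }
}

// Bring the path to the module into scope.
use crate::mymodule::mysubmodule;

fn main() {
    // Reached through the path brought into scope by `use`.
    println!("{}", mysubmodule::greet());

    // Reached through the full path from the crate root.
    println!("{}", crate::mymodule::mysubmodule::greet());
}
```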
Perhaps after I have spent more time with the language, and I am more certain about how the module system fits together, I would be able to write a more extensive post with step by step examples. But for now, I would only mention that Rust's module system is atypical, hence the intuition from other programming languages does not help in understanding how modules work in Rust. In fact, what you know about modules (or related concepts from your favourite language) is probably going to be a stumbling block.

Here are some of the links I ran into, that helped in shedding more light to the topic.

Rust modules vs files
The Rust module system is too confusing
Inline module syntax is confusing when learning modules (GitHub issue)
Data point about the new module system learnability and musings about language stability (Rust Internals)

Apart from getting a handle on module definitions and their mapping to the file system, there are other points I got from reading the 7th chapter that are worth mentioning:

  1.  Visibility is defined using the pub keyword.
  2. Structs can have some of their fields public while others remain non public. This is not the case with enums: once an enum is public, all of its variants are public.
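A small sketch of both points. The shapes module and everything in it are made up for illustration:

```rust
mod shapes {
    pub struct Circle {
        pub radius: f64, // visible outside the module
        id: u32,         // private: only code inside `shapes` can touch it
    }

    impl Circle {
        pub fn new(radius: f64) -> Circle {
            Circle { radius, id: 1 }
        }

        pub fn id(&self) -> u32 {
            self.id
        }
    }

    // Marking the enum `pub` makes every variant public.
    pub enum Kind {
        Round,
        Angular,
    }

    pub fn describe(kind: &Kind) -> &'static str {
        match kind {
            Kind::Round => "round",
            Kind::Angular => "angular",
        }
    }
}

fn main() {
    let c = shapes::Circle::new(2.0);
    println!("radius = {}, id = {}", c.radius, c.id());
    // println!("{}", c.id); // would not compile: field `id` is private

    println!("{}", shapes::describe(&shapes::Kind::Round));
    println!("{}", shapes::describe(&shapes::Kind::Angular));
}
```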

Even though this chapter took more effort than the previous chapters, I am still quite happy with the results. As mentioned in the previous post, I was a little bit anxious about approaching modules in Rust, so even though it took some time, I think I was able to get some good understanding in a reasonable amount of time.

So looking forward to continuing with the rest of the book!

Monday, January 06, 2020

Learning Rust - Day 3 - Structs and Enums

I just completed the 3rd session of my journey of learning Rust. You can read other posts in this series by following the label learning rust.

This session saw me covering chapter 5 and chapter 6 of The Rust Book. These chapters are about structs and enums respectively.

This session was quite a breeze, because the ideas presented in these chapters are really just about algebraic data types. Having encountered them before in Haskell, the session was more about seeing how the concepts are encoded in Rust than about learning new concepts.

In fact, I think the usual approach in most Haskell literature of explaining these concepts (i.e. using types, their relation to set theory, data constructors, etc.) is less confusing, less ad hoc, and makes things tie together nicely, especially when other concepts like pattern matching enter the picture. I guess taking time out to learn some Haskell wasn't a total waste of time after all :)

Having said that, there were still some things that stood out while going through these two chapters, and I enumerate them below. Yup, pun intended :)

When creating a struct, you can't mark individual fields as immutable or mutable. Mutability applies to the entire instance being created.
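A small sketch of what instance-level mutability looks like in practice. The User struct and the bump function are hypothetical names chosen for the example.

```rust
struct User {
    name: String,
    sign_in_count: u32,
}

fn bump(mut user: User) -> User {
    // `mut` is on the binding, so every field of `user` is writable here.
    user.sign_in_count += 1;
    user.name.push('!');
    user
}

fn main() {
    let frozen = User { name: String::from("ada"), sign_in_count: 0 };
    // frozen.sign_in_count = 1; // would not compile: `frozen` is not declared `mut`

    let updated = bump(frozen);
    println!("{} {}", updated.name, updated.sign_in_count);
}
```

There is no way to write something like a mut marker on just sign_in_count inside the struct definition; the choice is made where the value is bound.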

It is also possible to create a struct with unnamed fields. The term for this in Rust lingo is tuple struct, and the syntax uses () instead of {}. For example:

struct NormalPoint { x: i32, y: i32 } // no trailing ; after a braced struct
struct TupleStructPoint(i32, i32);
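For completeness, a sketch of how the fields of each kind of struct are read back. The two sum functions are made up for the example; tuple struct fields are accessed by position (.0, .1) rather than by name.

```rust
struct NormalPoint { x: i32, y: i32 }
struct TupleStructPoint(i32, i32);

fn sum_named(p: &NormalPoint) -> i32 {
    p.x + p.y // fields accessed by name
}

fn sum_tuple(p: &TupleStructPoint) -> i32 {
    p.0 + p.1 // fields accessed by position
}

fn main() {
    let a = NormalPoint { x: 1, y: 2 };
    let b = TupleStructPoint(3, 4);
    println!("{} {}", sum_named(&a), sum_tuple(&b));

    // A tuple struct can also be destructured with a pattern.
    let TupleStructPoint(first, second) = b;
    println!("{} {}", first, second);
}
```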

Since we are now in the terrain of systems programming, we can't afford to be totally oblivious to how things are laid out in memory. <strike>When we have things like struct or enums they reside on the heap</strike> Not true. See here. For heap-backed data such as a String, as we already know, the variable does not hold the data itself. In Rust lingo, the variable is the "owner", and it holds ownership-related information, one piece of which is a pointer to the memory location on the heap. If this is the case, how do we access things like the fields of a struct, or call its methods? Should we not get hold of a reference from the variable first, and then use that? Well, in Rust, no, because Rust has a feature called automatic referencing and dereferencing which does this for you.

(&p1).distance(&p2); // Doing this is redundant
p1.distance(&p2); // Rust automatically does that for you
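Here is a self-contained sketch of the two calls above. The Point struct and its distance method are my own guess at what p1 and p2 could be; the snippet above does not define them.

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // Takes `&self`, so the method only needs a shared reference to the point.
    fn distance(&self, other: &Point) -> f64 {
        ((self.x - other.x).powi(2) + (self.y - other.y).powi(2)).sqrt()
    }
}

fn main() {
    let p1 = Point { x: 0.0, y: 0.0 };
    let p2 = Point { x: 3.0, y: 4.0 };

    let explicit = (&p1).distance(&p2);     // reference taken by hand
    let automatic = p1.distance(&p2);       // Rust inserts the `&` for the receiver
    let through_ref = (&&p1).distance(&p2); // extra references are peeled off too

    println!("{} {} {}", explicit, automatic, through_ref);
}
```

Note that the automatic part only applies to the receiver (the thing before the dot); the argument still has to be written as &p2.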

It is also possible to define functions that are scoped to a particular struct or enum. These are called methods. The syntax is basically this:

impl StructOrEnum {
    fn your_function(&self) {
        
    }
}




Which I now sort of read as:

impl /*for an instance of: */ StructOrEnum {
  fn /*the following function:*/ your_function(&self) {
      
  }
}

When the defined function takes &self (or self, or &mut self) as its first parameter, it can be seen as an instance method, as you would call it in the OOP world. Without it, you have a static method, although in Rust lingo static methods are referred to as associated functions. Calling such associated functions involves the use of ::, e.g.:

struct AStruct;
impl AStruct {
    fn what_am_i() -> () {
        println!("{}", "I'm a struct")
    }
}

AStruct::what_am_i(); // prints "I'm a struct"

Also, there exists an if let syntax that can be used with pattern matching when one is only concerned with one of the patterns and wants to ignore the rest, effectively turning the match into an if...else construct. Basically this:

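enum PrimaryColor {
    Red,
    Green,
    Blue,
}

fn is_red(color: PrimaryColor) -> bool {
    // The pattern goes on the left and the value being matched on the right.
    // Writing `if let color = PrimaryColor::Red` would always succeed, because
    // `color` would then be a fresh binding that matches anything.
    if let PrimaryColor::Red = color {
        true
    } else {
        false
    }
}

println!("{}", is_red(PrimaryColor::Red)); // prints "true"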
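One way to get comfortable with if let is to see it next to the match it stands in for. The sketch below uses Option<i32> from the standard library; the two describe functions are made-up names.

```rust
fn describe_with_match(value: Option<i32>) -> String {
    match value {
        Some(n) => format!("got {}", n),
        _ => String::from("nothing"), // every other pattern is ignored
    }
}

fn describe_with_if_let(value: Option<i32>) -> String {
    // Same logic: one interesting pattern, everything else goes to `else`.
    if let Some(n) = value {
        format!("got {}", n)
    } else {
        String::from("nothing")
    }
}

fn main() {
    println!("{}", describe_with_match(Some(3)));
    println!("{}", describe_with_if_let(Some(3)));
    println!("{}", describe_with_if_let(None));
}
```

The trade-off, as far as I can tell, is that match forces every case to be handled, while if let quietly ignores the rest.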

Still not sure how I feel about this syntax. It looks like something that shouldn't be used that much?

As now seems to be customary for these journalling posts, I have some info I picked up that could be presented in a Q&A format:

Q. How do I print out a struct or enum to the console?
     A. I either implement the Display trait or the Debug trait. The Rust compiler can derive an implementation of the Debug trait for me: I just need to place #[derive(Debug)] above the struct/enum and then use {:?} or {:#?} instead of {} in the println! macro.
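A short sketch showing both routes side by side. The Point struct and the "(x, y)" output format chosen for Display are assumptions made for the example.

```rust
use std::fmt;

// Debug is derived by the compiler.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// Display has to be written by hand; the "(x, y)" layout is my own choice.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{}", p);    // uses Display
    println!("{:?}", p);  // uses Debug, on one line
    println!("{:#?}", p); // uses Debug, pretty-printed over several lines
}
```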

Q. How do I create an instance of a struct from another struct?
     A. Use the double dot syntax (the book calls it struct update syntax). Since a code snippet is better than a thousand words, here you go:

struct Point {x:i32, y:i32}
let p1 = Point {x:0, y:0};
let p2 = Point { x: 1, ..p1 };
println!("x is: {}, y is: {}", p2.x, p2.y); // prints x is: 1, y is: 0

My next study session will be about modules in Rust. I am slightly apprehensive about this, as I have heard that this is one of the unnecessarily confusing parts of the language. Fingers crossed, I'll definitely be journalling about how it turns out! :)

Sunday, January 05, 2020

Learning Rust - Day 2 - Getting Acquainted with Ownership.

So, I just finished day 2 of my learning Rust journey. You can read other posts in this series by following the label learning rust.

Note: this is the journaling of my experience learning Rust, where I pen down some of the insights I am picking up, things learnt, and things I can't yet wrap my head around in the language; not leaving out things I considered queer or inconvenient. As I said in the beginning post: The plan is that, after I become proficient in Rust, I can return to this series of posts and cringe at my ignorance!

This session saw me going through the 3rd and the 4th chapter of The book. The 3rd chapter, which is titled Common programming concepts was a breeze. The 4th chapter, which is about Understanding Ownership did not go that fast, as I encountered new ideas I had to let sink in. Below are some of the things that stood out for me at the end of the day:

You cannot declare and assign the same value to multiple variables in one go in Rust. This won't work:

fn main() {
    let a = b = 10; // this won't work
}

Or any other variant of the above.

This has to do with the fact that a let binding is a statement and does not yield any value (and a plain assignment such as b = 10 is an expression that evaluates to (), not to the assigned value). I also got to confirm what I mentioned in Learning Rust - Day 1: ending a line with ; makes it a statement in Rust, while the absence of ; makes it an expression.
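A minimal sketch of the statement/expression distinction, with made-up function names. The second function shows what an assignment actually evaluates to.

```rust
fn double_then_add_one(x: i32) -> i32 {
    // A block is an expression: its value is the final line without a `;`.
    let doubled = {
        let tmp = x * 2; // statement, yields nothing
        tmp              // expression, becomes the value of the block
    };
    doubled + 1 // no `;`, so this is the function's return value
}

fn assignment_is_unit(x: i32) -> ((), i32) {
    let mut b = x;
    // Assignment is an expression, but its value is (), not the assigned number.
    let unit: () = b = b + 10;
    (unit, b)
}

fn main() {
    println!("{}", double_then_add_one(4));
    println!("{:?}", assignment_is_unit(0));
}
```

So even with b declared beforehand, chaining assignments would leave the outer variable holding (), not 10.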

I also realised that the loop keyword is also an expression, and it is possible to return a value upon breaking out of the loop. The loop below would return 10 and that would be printed:

fn main() {
    let mut a = 0;
    let r = loop {
        a = a + 1;
        if a == 10 {
            break a
        }
    };

    println!("{}", r)
}


I also had to refresh my understanding of basic concepts around pointers, references, the stack and the heap. I found The 5 minutes guide to C pointers post particularly useful in motivating pointers and their syntax. This segued into Understanding ownership in Rust.

The key takeaway about ownership is that I now need to be mindful when assigning stuff to variables and passing variables into a function, and of what happens in terms of where things are in memory (stack or heap).

Previously I could get away with a mental model that sees variable assignment and passing arguments to a function as a by-value operation, i.e. values are copied and given to the new variable or function. No more, at least for data that resides on the heap (this still applies for data on the stack). When such operations happen, instead of thinking about values being passed, think about ownership being moved, where the "ownership content" can be thought of as the pointer to the memory location of the data, the current size of the data, and the reserved memory capacity on the heap.
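The "ownership content" can actually be inspected. The sketch below uses the standard String methods as_ptr, len and capacity; the parts helper is a made-up name. It shows that a move hands over the same heap buffer rather than copying the characters.

```rust
// Returns the three pieces of bookkeeping a String carries:
// pointer to the heap buffer, current length, reserved capacity.
fn parts(s: &String) -> (*const u8, usize, usize) {
    (s.as_ptr(), s.len(), s.capacity())
}

fn main() {
    let s = String::from("hello world");
    let before = parts(&s);

    let ss = s; // the bookkeeping moves to `ss`; the heap bytes stay where they are
    let after = parts(&ss);

    println!("same heap buffer: {}", before.0 == after.0);
    println!("length {} and capacity {}", after.1, after.2);

    // Stack-only data such as i32 implements Copy, so both bindings stay valid.
    let x = 5;
    let y = x;
    println!("{} {}", x, y);
}
```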

With the above, then it becomes trivial to see why the following piece of code won't compile:

fn main() {
    let s = String::from("hello world");
    let ss = s; // ownership has moved from s to ss

    println!("{}", s); // s is now invalid since ownership moved
    println!("{}", ss);
}

compile error:

   Compiling chapter_3_4 v0.1.0 (/Users/chapter_3_4)
error[E0382]: borrow of moved value: `s`
  --> src/main.rs:11:20
   |
8  |     let s = String::from("hello world");
   |         - move occurs because `s` has type `std::string::String`, which does not implement the `Copy` trait
9  |     let ss = s; // ownership has moved from s to ss
   |              - value moved here
10 |
11 |     println!("{}", s); // s is now invalid since ownership moved
   |                    ^ value borrowed here after move

error: aborting due to previous error

For more information about this error, try `rustc --explain E0382`.
error: Could not compile `chapter_3_4`.

To learn more, run the command again with --verbose.

Process finished with exit code 101


Or why, even when applying borrowing via the use of &, the following won't compile:

fn main() {
    let mut s = String::from("hello world");
    let ss = &mut s;

    println!("{}, {}", ss, s);
}

nor this:

fn main() {
    let mut s = String::from("hello world");
    let ss = &mut s;

    println!("{}", s);
    println!("{}", ss);
}

But this will:

fn main() {
    let mut s = String::from("hello world");
    let ss = &mut s;

    println!("{}", ss);
    println!("{}", s);
}

and even the first example will compile with a slight modification (replacing let ss = &mut s with let ss = &s):

fn main() {
    let mut s = String::from("hello world");
    let ss = &s;

    println!("{}, {}", ss, s);
}

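All of this is due to the interplay between a mutable reference and borrowing the same value again down the line as immutable. While a mutable reference is still going to be used later, the original variable cannot be read; once the mutable reference has been used for the last time, the borrow ends and the variable becomes usable again.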
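A sketch of the same rules with an actual mutation in the picture. The shout function is a hypothetical helper; the comments mark where each borrow starts and ends.

```rust
fn shout(text: &mut String) {
    text.push('!');
}

fn main() {
    let mut s = String::from("hello world");

    let ss = &mut s;   // mutable borrow starts here
    shout(ss);         // last use of `ss`, so the mutable borrow ends here
    println!("{}", s); // fine: no mutable borrow is live any more

    let a = &s;
    let b = &s;        // any number of shared borrows may coexist
    println!("{} {}", a, b);
}
```

Moving the println!("{}", s) line above shout(ss) would bring the compile error back, since s would then be read while ss is still waiting to be used.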

In general a high level summary of what I took away from the chapter about ownership is:

1. Understanding scope, ownership, and how ownership can be moved and lost through assignment, passing a value as an argument, and a variable going out of scope.

2. Understanding borrowing via the use of &, the difference between mutable and immutable references, and the various rules that ensure these interplay safely.

The intuition I am getting is that Rust ensures that, at any point in time, multiple variables cannot own the same address in heap memory. In the case of references, you have a reference to the owner, and even though you can have multiple of these, the compiler ensures this is done in an orderly manner that does not lead to memory access bugs.

There were also a couple of tidbits I picked up:

Q. How do you silence the unused variable warning?
     A. By placing #[allow(unused_variables)] above the function where the unused variable occurs or above the let statement of the unused variable (the #![allow(unused_variables)] form, with the !, goes at the top of the enclosing file or block instead), or by prefixing the variable name with an _
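The three options from the answer above, sketched with made-up function names. Each function compiles without an unused variable warning.

```rust
// Option 1: attribute on the whole function.
#[allow(unused_variables)]
fn whole_function() -> i32 {
    let ignored = 1;
    0
}

// Option 2: attribute on just the one `let` statement.
fn single_binding() -> i32 {
    #[allow(unused_variables)]
    let ignored = 2;
    0
}

// Option 3: a leading underscore tells the compiler the variable is unused on purpose.
fn underscore_prefix() -> i32 {
    let _ignored = 3;
    0
}

fn main() {
    println!("{} {} {}", whole_function(), single_binding(), underscore_prefix());
}
```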

Q. Does Rust have something similar to typed holes like you have in Haskell or PureScript?
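     A. Not quite, but there is a mechanism that comes close: annotate a binding with the unit type, _: (). For example:
let _: () = String::from("some value");
The compiler error message for this will contain something like expected type `()` found type `std::string::String` (the exact wording varies between compiler versions), which reveals the actual type of the expression.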

That is it for now!

Friday, January 03, 2020

Learning Rust - Day 1

So I decided to learn Rust. I have had it in mind for a while, but decided to start now. I see it as my contribution to the whole new year, new goals, new resolution brouhaha currently ravaging everywhere.

And since I am temporarily off twitter*, my go to place for shouting into the void, I am choosing to blog my experience here on my blog (like it’s 2006!). So basically I would be going through The Book as much as time allows me per week, and I would be journaling the good, the bad, and the utterly ugly things I find on the way here!

The plan is that, after I become proficient in Rust, I can return to these posts and cringe at my ignorance!

So let us get started.