Saturday, September 19, 2015

Krwkrw 0.1.3 Released

Just pushed the latest release (0.1.3) of Krwkrw to Maven Central.

Krwkrw is a web scraper. You can read how it came into being here.

A quick rundown of the changes in 0.1.3:
  1. Ability to express URLs to be included/excluded using regex patterns. For example:
    Krwkrw crawler = new Krwkrw(action);
    crawler.match("(\\S+)(/projects/)(\\S+)");
    
    makes sure that only content under the /projects/ path is processed, while
    Krwkrw crawler = new Krwkrw(action);
    crawler.skip("(\\S+)(/projects/)(\\S+)");
    
    will fetch and process all content except that under the /projects/ path.
  2. Ability to have random delays in between requests.
    Before now, it was only possible to set a fixed number of seconds to wait between requests. For example:
    Krwkrw crawler = new Krwkrw(action);
    crawler.setDelay(5); // waits 5 seconds between requests
    
    With the 0.1.3 release, it is possible to have a random delay: each request is delayed by a number of seconds picked randomly between a lower and an upper bound. For example:
    Krwkrw crawler = new Krwkrw(action);
    // the wait will be a random number of seconds between 5 and 20
    crawler.setDelay(5, 20);
    
  3. Change in API: the doKrawl method has been replaced with crawl.
  4. Fixed an issue where it was possible for the crawler to crawl pages outside the origin URL.
  5. Some minor improvements here and there...
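
To get a feel for items 1 and 2 above without pulling in the library, here is a rough standalone sketch using only the JDK. It is not Krwkrw's internal code: the class name CrawlRulesSketch and both helper methods are made up for illustration, and it assumes that a pattern has to match the whole URL and that both delay bounds are inclusive. Neither assumption is confirmed by the release notes.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.regex.Pattern;

// Hypothetical standalone sketch; not part of the Krwkrw API.
final class CrawlRulesSketch {

    // The same pattern used in the match/skip examples above.
    static final Pattern PROJECTS = Pattern.compile("(\\S+)(/projects/)(\\S+)");

    // Assumption: a URL is selected only if the whole URL matches the pattern.
    static boolean matchesProjects(String url) {
        return PROJECTS.matcher(url).matches();
    }

    // Assumption: both bounds are inclusive, so (5, 20) can yield 5 or 20.
    static int randomDelaySeconds(int lower, int upper) {
        return ThreadLocalRandom.current().nextInt(lower, upper + 1);
    }

    public static void main(String[] args) {
        // A URL with something after /projects/ matches the pattern.
        System.out.println(matchesProjects("http://example.com/projects/krwkrw")); // true
        // A URL outside /projects/ does not.
        System.out.println(matchesProjects("http://example.com/about")); // false
        // Prints some value from 5 to 20; it differs from run to run.
        System.out.println(randomDelaySeconds(5, 20));
    }
}
```

Under these assumptions, match would keep the first URL and drop the second, while skip would do the opposite. Note that the trailing (\\S+) needs at least one character after /projects/, so the bare /projects/ index page itself would not match.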
If using Maven as your build tool, you can add it to your project via:
<dependency>
<groupId>com.blogspot.geekabyte.krwkrw</groupId>
<artifactId>krwler</artifactId>
<version>0.1.3</version>
</dependency>

If using Gradle, then:
dependencies {
compile "com.blogspot.geekabyte.krwkrw:krwler:0.1.3"
}

The Javadoc is accessible online here.

You can also check out Krwkrw on GitHub.

Sunday, September 06, 2015

What Does Changing Someone Else's Code Have In Common With Defusing A Bomb?


A while back, the above picture bubbled up in my Twitter stream. It was a weekday morning, if I remember correctly, and I was about to nod my head in agreement and push the retweet button, a gesture to show that the picture also holds for me. Then it slowly struck me: that was not the case. This does not hold true for me. Maybe it did before, but not anymore.

See, I understood the sentiment expressed in the picture perfectly well, whether from building on a codebase written by someone else (aka a former employee) or from working on the same codebase with other developers. That apprehension when approaching someone else's code, that feeling of things blowing up in unpredictable and unexpected ways, was just too familiar.