Monday, January 22, 2018

Playing with strings and performance in C#

A few weeks ago, I was solving a problem at HackerRank, and even after completing a solution that ran in under 100 milliseconds, the program would still time out during execution. HackerRank's limit is usually 1 second, so something was taking more than 900ms.

It was the strings!

Yes, we know handling strings is costly, but this time there was no way around it.  At the final stage of the problem I had to construct a string over 200k chars long from an array of Boolean values that represented the bits of the number.  

Disclaimer: Strings are slow, mainly because they are immutable. Concatenating two strings creates a third one, and possibly leaves two for the garbage collector to pick up.

Looking through the code, I noticed I was simply using "+=" instead of the StringBuilder.Append method. After switching to StringBuilder the solution was faster and the submission was approved. It was fast now, but it got me thinking "how much faster, exactly?", which ended with me measuring the performance of concatenating strings. We always knew StringBuilder was faster than regular string operations, but I still wanted to see the numbers.

I used C# and tested the same set of examples in .NET Core and the full .NET Framework. I did not notice any performance differences between the two.

What was measured?

I measured 3 ways to concatenate strings, each of them also in reverse mode (inserting in front instead of adding at the end). Here are the 6 variants:
  • Using the + operator (str += "1";)
  • Using the + operator to insert in front (str = "1" + str;)
  • Using string.Concat (str = string.Concat(str, "1");)
  • Using string.Concat to insert in front (str = string.Concat("1", str);)
  • Using StringBuilder.Append to add at the end
  • Using StringBuilder.Insert to place in front
Every operation was run against data sets of multiple lengths: 10k, 20k, 30k, 40k and 50k characters. The chart below shows the results in milliseconds.

It was a surprise to see that StringBuilder.Append is not only faster, but appears to run in constant time, O(1). It was also a surprise to see that the Insert method is, for certain sizes, slower than the other operations, so StringBuilder is not always faster.

Now, if we run the tests for a data set with 100k characters, we get a new surprise: the Insert method starts being faster at a bit over 50k chars in length, while the other operations degrade in performance.

If you like competitive programming or have an interest in high-performance solutions, make sure to always measure. Experiment a lot in code and test different scenarios. In my case, the array was reversed, so I walked it backwards and used the Append method instead of Insert. That was the difference between failing and passing a problem.

Check the code for StringBuilder on GitHub. The implementation uses string.wstrcpy, which, while blazing fast, is still O(n).

Sunday, January 14, 2018

New programming book published on Amazon

After a relatively long hiatus as a writer I finally completed a new milestone. I coauthored a programming book and published it on Amazon. I must say I am glad that we finally completed it; while it was a fun and rewarding journey, it was a long one and it took a lot of time, effort and coordination. 

The book is in Spanish, and it's 500+ pages long.

Title: "Empiece a Programar. Un enfoque multiparadigma con C#"
Format: Paperback

The title translates to "Begin programming. A multi-paradigm approach with C#". The book is written in Spanish, not only because all the authors are Cuban, but also because we believe that the Spanish-speaking community deserves a fresh new book instead of a translated one.

We used C# throughout the book in order to teach programming, and we travel from the simple "Hello World" snippet to arrays, data structures, dynamic programming, recursion, inheritance, SOLID principles, functional programming and concurrency - among other subjects - across 17 chapters.  

Every chapter is loaded with code examples, detailed explanations and images to help the learning process. It's aimed at engineering and/or programming students, and even though the material is tailored for beginners, it can also be useful to more senior developers wanting to deepen their knowledge of programming.

Here are the links to buy the book:

Tuesday, May 16, 2017

It's not RESTful, but does it matter?

All too often we see discussions about what is and what is not REST, with some developers excitedly arguing about REST purity. However, does it actually matter? Do you need that level of "purity"? I personally think you do not, and I will try to explain why, although I am already convinced that I will fail in my attempt.

REST means REpresentational State Transfer, and while almost everybody will tell you that it is not necessarily related to HTTP (and technically it isn't), in reality it is, and that's a fact. So when we apply it to our Web Services, we have our new REST APIs or RESTful APIs mixed with HTTP flavors.

Also consider that REST is not a protocol or a specification; it is an architectural style. So, technically, it is a set of conventions and guidelines and not an actual set of unbreakable rules (like a protocol).

But they are great: RESTful, REST-like and almost-RESTful APIs are all great. I have been writing and playing with APIs since around 2001, and came all the way from XML and SOAP (from the manual aspx, to asmx, to WCF) and then into JSON and REST, and I love them all. In fact, the only thing I don't like about REST is the people behind it trying to defend it so vehemently :) But there are some issues I am sure you might have encountered before:
  1. Authentication: Purity zealots say that REST is pure and not HTTP-only, yet they use HTTP headers/cookies for authentication over a RESTful API... wat?
  2. Sometimes it becomes really hard to do it the REST way (and RPC style feels natural).
  3. The typical verbs are not expressive enough for my API (all real-life systems are more complex than CRUDy examples), and this kind of ties to #2.

Rule of thumb: Never stop when they tell you "it's not RESTful if..."

I found this online, no idea who the author is...

Let's shed some light on these 3 cases:

Case #1

Authentication against a RESTful service: there should be no discussion. Your system architecture will make things easier or harder for you, and your decision can't be to put the "purity" of an architectural style over the consistency, completeness and correctness of your system. I have seen scenarios where cookies bring transparency, and others where a token passed in a query string parameter is a better fit. Don't stop a solution because someone tells you "it's not RESTful if...".

Case #2  

REST is definitely good, but it is not enough, because not everything is a resource. While the idea of exposing documents, files and resources (nouns) feels natural for a huge number of cases, you will eventually find a situation where it is not natural to expose something as a resource. "Resources" and "Processes" can be really complicated in the enterprise. Think about it: we are translating the logic from our objects and methods/functions to basically one "object" whose status can only be accessed CRUDily.

The REST workaround to deal with this kind of issue is to "make it" a resource and then deal with its status. Let's say I have a process that can start/stop/pause/rotate/jump/recharge/sing (it is my process, so the actions are whatever I decide :) ). One RESTful way to do this is to expose:

GET http://address/process/status 

as a resource. We could GET the endpoint to query it and then perform a PUT with the new status every time:

   //other fields
and then starting or recharging would imply sending the status over via a POST.

POST http://address/process/status

This does not feel natural: passing the whole status of a process back and forth just to trigger start/stop "actions". It might be a RESTy way to do things, but it certainly does not help me in this case. What if we have 100 actions? RPC would be a better option here.

POST http://address/process/<start|stop|pause>

We can do things like:

POST http://address/process/jump?howhigh=20

and we could still query the status by calling in a RESTy way:

GET http://address/process/status

There is no rule against combining them. Just go over your API and improve things as you go. Make things easier for you and the clients of your API. Blindly following guidelines for the sake of it will yield no reward except suffering.
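To make the combination above concrete, here is a minimal sketch of a handler that serves the status in a RESTy way and exposes the actions RPC-style. It is only an illustration: createProcessApi, the in-memory state fields (running, lastJump) and the returned { status, body } shape are assumptions made for this example, and no real HTTP server is involved.

```javascript
// Hypothetical in-memory process. The action names come from the examples
// above; the state fields are invented for illustration only.
function createProcessApi() {
  const state = { running: false, lastJump: null };

  // RPC-style actions, one per verb: POST /process/<action>
  const actions = {
    start: () => { state.running = true; },
    stop: () => { state.running = false; },
    jump: (query) => { state.lastJump = Number(query.howhigh); },
  };

  // Tiny router: takes an HTTP method, a path and an already-parsed query
  // object, and returns a { status, body } pair.
  return function handle(method, path, query = {}) {
    // REST-style read of the current status.
    if (method === "GET" && path === "/process/status") {
      return { status: 200, body: { ...state } };
    }

    // RPC-style action call.
    const match = /^\/process\/([a-z]+)$/.exec(path);
    if (
      method === "POST" &&
      match &&
      Object.prototype.hasOwnProperty.call(actions, match[1])
    ) {
      actions[match[1]](query);
      return { status: 200, body: { ...state } };
    }

    return { status: 404, body: { error: "unknown endpoint" } };
  };
}
```

With this sketch, adding the 100th action means adding one more entry to the actions table, while clients can still poll GET /process/status as before.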

Having a well documented API is more important than breaking your head trying to conform to some guidelines.

Case #3

The verbs are not enough. Let's take the data structure of a stack and try to represent it in a REST API:


would be my "stack" resource. The only way to modify a stack is by calling PUSH and POP operations. We don't have those verbs, but we can try and see if they fit our RESTful idea. I also need a COUNT, to know how many items are on the stack, and a PEEK, to read the one on the top without modifying the collection. Care to do this in a RESTful way?

GET http://address/stack

What should this do? To be RESTful, this should just return the whole stack, which is not very useful for what you typically want from a stack (the same logic could apply to a "queue"). It would also be a terrible way to get the COUNT, and I really want the O(1) cost of some operations over a stack.

POST http://address/stack

This could be our PUSH, sending an object to the top of the stack. What about POP?

DELETE http://address/stack

This makes no sense, because we are not deleting the stack itself. Even something like /stack/top would be incorrect, since the element at that URI would be different after a delete, and DELETE must be idempotent.

Then what? POST? We already used it. Once you reach this point, it is better to just go back to RPC again.

GET http://address/stack/count  (COUNT)
GET http://address/stack/top    (PEEK)
POST http://address/stack/push  (PUSH)
POST http://address/stack/pop   (POP)

POP would "modify" the stack, so POST is the catch-all verb to use here. DELETE won't do it, and PUT even less; they are both idempotent.

All of this can also be applied to a queue, a circular queue or many other data structures. The verbs might not be enough to support your operations from a semantic point of view, and URIs might not unequivocally represent a resource. /stack/top is not a permanent URI for "the element" on the top; it is the address of the "top of the stack", and the element itself can and will vary over time.
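The four RPC-style endpoints listed above could be sketched like this. Again, this is only an illustration under assumptions: createStackApi, the { status, body } return shape and the specific status codes (201, 404, 409) are choices made for the example, not something any REST guideline prescribes.

```javascript
// Hypothetical in-memory stack exposed through the four endpoints above.
function createStackApi() {
  const items = [];

  // Takes an HTTP method, a path and an optional request body,
  // and returns a { status, body } pair.
  return function handle(method, path, body) {
    switch (`${method} ${path}`) {
      case "GET /stack/count": // COUNT: O(1), no need to fetch the whole stack
        return { status: 200, body: { count: items.length } };

      case "GET /stack/top": // PEEK: reads the top without modifying anything
        return items.length > 0
          ? { status: 200, body: { item: items[items.length - 1] } }
          : { status: 404, body: { error: "stack is empty" } };

      case "POST /stack/push": // PUSH
        items.push(body);
        return { status: 201, body: { count: items.length } };

      case "POST /stack/pop": // POP: not idempotent, every call changes the stack
        return items.length > 0
          ? { status: 200, body: { item: items.pop() } }
          : { status: 409, body: { error: "stack is empty" } };

      default:
        return { status: 404, body: { error: "unknown endpoint" } };
    }
  };
}
```

Calling POST /stack/pop twice returns two different elements, which is exactly why an idempotent verb such as DELETE or PUT is a poor fit for it.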

I have a hard time understanding how RESTful could be a silver bullet without the ability to represent the simplest of structures. REST and its CRUD style are not enough for everything out there.

Tuesday, May 17, 2016

JustMock and "Failed to initialize CoreCLR" on RC2

Unable to start the process. Failed to initialize CoreCLR, HRESULT: 0x80131500

I just installed ASP.NET Core RC2 and was able to get it working from the command line on some new basic test projects. But when I tried to do it from Visual Studio, I got the error shown above.

If I tried to run "dotnet" in a console opened from Visual Studio, I would get the same error. So the issue was that something was off for the VS configuration/environment. 

After verifying that "dotnet" was able to run perfectly everywhere else, I took on the task of comparing the environment variables (by bisection, of course). One particular variable made the difference: "JUSTMOCK_INSTANCE=XXXXX", which led me to notice that JustMock was interfering with it somehow.

So, if you use JustMock just turn it off for RC2 projects. (No need to uninstall it, just disable the profiler)

Tuesday, May 03, 2016

Node.js fs.readFile and the BOM marker

While working on a ReactJS project, I hit a small glitch trying to read a JSON file and parse it as a JavaScript object.
  getDataObject() {
        var dataString = fs.readFileSync("./json/data.json", "utf8");
        return JSON.parse(dataString);
  }

  SyntaxError: Unexpected token

After checking, double-checking and triple-checking that the file was correct, I went deeper and realized that "fs.readFileSync" was returning the BOM of the UTF-8 file at the beginning of the string. So we just need to strip it out.
  getDataObject() {
        var dataString = fs.readFileSync("./json/data.json", "utf8");
        // Strip a leading BOM (U+FEFF) before parsing.
        return JSON.parse(dataString.replace(/^\uFEFF/, ""));
  }

I then checked and there are "packages" for this; however, I don't think it is proper to import a new dependency just to fix something this small.

The big question here is why this is considered correct. The BOM is not part of the string; it's an optional marker used to aid in reading the content, but it is not part of the content.
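If the same fix is needed in more than one place, the one-line replace above can be wrapped in a small helper instead of pulling in a package. This is just a sketch; stripBom and parseJsonText are names made up for this example, and the functions work on a string that has already been read (for instance with fs.readFileSync and the "utf8" encoding).

```javascript
// Remove a leading U+FEFF, which is how the BOM shows up once the file
// content has been decoded to a JavaScript string. Other strings are
// returned unchanged.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

// Parse JSON text that may or may not start with a BOM.
function parseJsonText(text) {
  return JSON.parse(stripBom(text));
}
```

Checking only the first character keeps the helper cheap, and a U+FEFF that appears anywhere else in the content is left untouched.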

Friday, December 04, 2015

Red blinking issue on Microsoft Display Dock

Red blinking light? Check the power supply cord.

I just got my Microsoft Display Dock courtesy of MS after pre-ordering a Lumia 950XL. But after the first attempt I just got a red blinking light on the front and nothing more. The phone won't recognize it and the display connected would just go to sleep.

After playing with it a bit I realized the USB-C connector on the dock's power supply (identical to the "fast charger") was a bit loose. I then compared it to the one that came with the 950XL and voilà! See the difference below.

Left: Adapter from the 950xl box | Right: Adapter from the Display Dock
Since it is shorter (manufacturer error? bad batch? who knows...)  the adapter packaged with the Display Dock won't reach the connector properly. However, it will fit in and charge the 950XL perfectly. I think one of the reasons is the thick casing in the Microsoft Display Dock. Check it out.
The case is sturdy and thick.

So if you happen to have a Microsoft Display Dock, then you already have a 950/950XL that most likely came with a "fast charger" (why else would you get the accessory?). You can either switch them or return the dock to Microsoft for a replacement.

However the chargers are identical and you can also fast charge while connected to the dock.

Thursday, October 08, 2015

Why this new Microsoft is impressive

The last Microsoft event on October 6 showed a few surprises. Let me explain why this is impressive and why I think the new Nadella direction at MS is different. Iris Scanner, Liquid Cooling, Continuum...

A new Surface Pro 4 tablet with Liquid Cooling
The new Surface Pro 4 was a really good upgrade, and a really big one. More resolution in the same size, up to 16GB of RAM and 1TB of storage, and Liquid Cooling. All that packaged in a device as thin as this, only 8.5mm deep.

Check how the cooling works.

New phones with Continuum and Liquid Cooling

I know Windows only has 3% of the phone market, and that's a bummer because we don't get to have all the apps. But now the new phones have Continuum, a feature that allows using a phone as a full-size computer. They also managed to get liquid cooling into a handheld device. That's a good direction. Let's just hope this new direction, combined with universal apps, will get more developers on board.

A new Surface Book.

Things got even crazier when MS announced this product; I think it is safe to say that nobody saw it coming. A new, powerful laptop hybrid (although expensive) and a crazy beautiful product. It looks like Microsoft is finally combining good design with good specs in an effort to win the consumer market.

That's impressive, right?

A new line of products, beautiful and powerful. New technologies like Continuum, the iris scanner and liquid cooling in devices so thin and small. Newly redesigned products like the new Band or the new Pen for Surface. This is a new MS, way different from the one that created the Zune.

PS: if you are reading this from CodeProject, then the CP crawler is still buggy and pulling posts without the CodeProject tag.