Writing No-Framework ASP.NET (part 1: encodings)

For the past week, I have been writing some code in ASP.NET that I wanted to run on a Microsoft IIS server. Here I want to post what I learnt and what the structure of my code looked like.

First, my requirements.

  1. I didn’t want to code a full web application. All I needed to do was to add a few dynamic elements to otherwise static web pages written as HTML files.
  2. I wanted to modify the existing pages as little as possible.
  3. I didn’t want a solution that would require installing plugins to the IIS server. This meant that I couldn’t use PHP and that I had to code in ASP.NET.
  4. I didn’t want to learn a lot to do this. Having to use an IDE or having to learn a framework was completely out of the question.

These are really simple requirements. However, the solution turned out to be quite complicated, with issues ranging from character encoding to code reuse. The following is an outline of what I will discuss.

  1. Understanding how ASP.NET handles source file encoding. (part 1)
  2. Basic ASP.NET web page (.aspx) structure. (part 2)
  3. Making Visual Basic function calls as terse as possible in the view code. (part 2)
  4. Ways to reuse code in ASP.NET. (part 3)

I’ll start with understanding source file encoding and then describe the others in separate articles.

The .aspx files

The simplest way to code ASP.NET web sites is to use ASP.NET web pages (.aspx files). These are similar to PHP .php pages in that you can include both code and HTML markup in a single file.

The .aspx files are compiled before being run in IIS.

An important thing is that all strings in .NET are represented in Unicode. Strings that are not in Unicode are not allowed. This is very different from PHP. In PHP, strings are a simple collection of bytes. It is up to the programmer to keep track of which charset each string is encoded in.
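To see what this means in practice, here is a small VB.NET sketch (the behavior is the same no matter what encoding the source file was saved in);

[vbnet]
Dim s As String = "日本語"
' s.Length is 3: .NET counts UTF-16 code units,
' not the bytes of any particular encoding.
[/vbnet]

In PHP, by contrast, strlen("日本語") returns the raw byte count, which depends on the file's encoding (9 in UTF-8, 6 in Shift-JIS).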

This means that any .aspx file that is not in Unicode has to be converted from its original encoding at compile time. This includes all the hard-coded HTML strings. Also, if the output from the server is going to be non-Unicode, the reverse conversion (from internal Unicode to the output encoding) has to happen as well.

Example of how encoding of an .aspx file would happen

Since a large number of websites in Japan still use Shift-JIS encoding, let’s assume that we are working with a Shift-JIS encoded website. Hence the source files to which we want to add dynamic ASP.NET code are in Shift-JIS. We also want the HTML output to be in Shift-JIS so that all web pages on the site (both static HTML pages and .aspx pages) have the same charset.

In this scenario, character encoding conversion of .aspx files would happen in the following manner;


  1. Shift-JIS encoded .aspx source files are converted to Unicode on compile.
  2. Code inside the .aspx source files is run as Unicode.
  3. Output is converted back to Shift-JIS.

With this in mind, let’s look at how to configure stuff to ensure that the encoding happens correctly.

ASP.NET configuration hierarchy

Before going into the charset configuration, I want to briefly touch on how ASP.NET web applications are configured. The configuration hierarchy is quite complex. Compare this to PHP where you basically have one php.ini file to configure all PHP instances, and the Apache .htaccess file where you can put additional settings. In both ASP.NET and PHP, you can additionally change settings inside the application (.aspx or .php).

However, even with ASP.NET’s complex hierarchy, you will probably only have to worry about the web.config file. You basically place a web.config file at any location in your web application’s file hierarchy, and that file will change the settings for that directory and any subdirectories. It works like Apache’s .htaccess.

Telling ASP.NET what charset the source code files (.aspx files) are encoded in

Configurations for encoding are set in the globalization element of a web.config file with the fileEncoding attribute. A minimal example looks like this;

[xml]
<configuration>
  <system.web>
    <globalization fileEncoding="shift_jis" />
  </system.web>
</configuration>
[/xml]

If fileEncoding is not specified in the configuration hierarchy, the system encoding of the server is used. For machines running a Japanese OS, this would be Shift-JIS.

A list of possible encodings is provided by Microsoft. For Japanese encodings, we have “utf-8” (code page 65001), “shift_jis” (CP932: code page 932) and “EUC-JP” (code page 20932). The Japanese code pages are not pure Shift-JIS or pure EUC-JP but have extensions for Windows.

Now that I’ve talked about how to set the ASP.NET configuration for fileEncoding, let’s see how this affects charset conversion.

The rules are as follows;

  1. If the .aspx source code file has a Unicode BOM, then the file is considered to be in the Unicode encoding as described in the BOM.
  2. If there is no BOM, then the file is considered to be in the encoding as configured in the ASP.NET settings (i.e. web.config, etc.).

Telling ASP.NET what charset the HTML output should be encoded in

ASP.NET internally manages the .aspx file contents in Unicode, and converts them to the responseEncoding before it sends the response to the client browser. ASP.NET also sets the charset in the “Content-Type: text/html; charset=???” HTTP header to responseEncoding.

ASP.NET however does not set the <meta charset=???> tag inside the HTML <head> element. You have to manage this yourself.
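For the Shift-JIS example site, that means putting the tag in yourself and keeping it in sync with responseEncoding;

[xml]
<head>
  <meta charset="shift_jis">
</head>
[/xml]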

ASP.NET uses the following locations to set responseEncoding.

web.config

[xml]
<configuration>
  <system.web>
    <globalization responseEncoding="shift_jis" />
  </system.web>
</configuration>
[/xml]

The @ Page directive
.aspx files contain a @ Page directive to set page-specific attributes. You can set responseEncoding here with the following syntax;

[vbnet]
<%@ Page Language="VB" ResponseEncoding="shift_jis" %>
[/vbnet]

The Page object
You can also set the response encoding in code. At runtime, this is exposed as the ContentEncoding property of the Response object;

[vbnet]
' Code page 932 is Windows Shift-JIS
Response.ContentEncoding = System.Text.Encoding.GetEncoding(932)
[/vbnet]

Telling ASP.NET what charset the request parameters are encoded in

In addition to setting the charset of the source file (fileEncoding) and setting the charset of the HTML output (responseEncoding), ASP.NET has another charset that you can specify. That is the charset of the request (requestEncoding).

This setting affects how the query-string data and the data coming in from POST requests are interpreted by the ASP.NET server. You set this in web.config like so;

[xml]
<configuration>
  <system.web>
    <globalization requestEncoding="utf-8" />
  </system.web>
</configuration>
[/xml]

The default value is “utf-8”.

The charset used by browsers to send queries and post data is a complicated issue. Ruby-on-Rails for example, adds an extra parameter to ensure that all data is in UTF-8.
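The extra parameter Rails adds is a hidden form field containing a checkmark character; if the checkmark arrives at the server intact, the form was submitted as UTF-8. The Rails form helpers emit something like;

[xml]
<input name="utf8" type="hidden" value="&#x2713;" />
[/xml]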

Microsoft’s documentation suggests that requestEncoding should be set to the same charset as responseEncoding for a single-server application. Of course this depends on how ASP.NET servers work, but in general I don’t think this is a good idea. I think requestEncoding should be set to “utf-8” regardless of the responseEncoding (the charset of the HTML output), and this is also how Ruby-on-Rails does it.

Encoding settings for a Shift-JIS encoded website

Let’s go back to the settings that would be required if we were working on a Shift-JIS encoded website. The requirements are;

  1. .aspx source files are encoded in Shift-JIS.
  2. HTML output is in Shift-JIS.

Then the web.config file should look like this;

[xml]
<configuration>
  <system.web>
    <globalization
      fileEncoding="shift_jis"
      responseEncoding="shift_jis"
      requestEncoding="utf-8" />
  </system.web>
</configuration>
[/xml]

I arbitrarily set requestEncoding to “utf-8”. This is how I would set up the system, but it really depends on how your server decodes requests. It does not affect the HTML output from your .aspx files.

Summary

This was the first article in my series on working with ASP.NET. It dealt with how ASP.NET handles source file encoding. ASP.NET converts the whole source file (the .aspx file) into Unicode before any of the code runs, and that is why the file encoding needs to be configured. This is also why you have to specify the output encoding.

PHP doesn’t meddle with string encodings in the source files. Encodings are converted on a per-function basis and the programmer is responsible for managing the conversions. In practice, programmers will convert request parameters and output much as ASP.NET does. However, programmers will seldom touch the hard-coded HTML strings in the source files. The idea of converting the encoding of hard-coded HTML is quite surprising, and it was a shock to find that ASP.NET does this in the background.

The ASP.NET way is not inherently a good or bad idea, but it can cause issues when you are simply adding dynamic content to a pre-existing website. You need to make sure that your settings align with the encodings and the workflows of your colleagues who might edit your .aspx files with various editors.

Coming from a web-development background, the ASP.NET way is certainly alien.

Other articles in this series

  1. Understanding how ASP.NET handles source file encoding. (part 1)
  2. Basic ASP.NET web page (.aspx) structure. (part 2)
  3. Making Visual Basic function calls as terse as possible in the view code. (part 2)
  4. Ways to reuse code in ASP.NET. (part 3)

Notes on Character Encoding Conversions

I did a quick bit of research on Japanese character encodings and how functions in PHP handle the conversions between them.

The table below summarizes the results (click to enlarge).

(Screenshot: a table comparing how each PHP conversion function handles each character)

We can see the following;

  1. Although Shift-JIS (SJIS) is still the most common format in Japan, it is terrible at handling special “hankaku” (single-width) characters. It simply leaves out a lot of them; even the ones that we would like to use quite frequently.
  2. The PHP mb_convert_encoding function gives up when it can’t find a matching character, and deletes the character. On the other hand, iconv does a pretty good job of finding a good substitute if we specify //TRANSLIT.
  3. Gathering from webpages that I can find on the subject, a lot of people seem to prefer mb_convert_encoding with the sjis-win encoding. This is a lousy solution if you are using special “hankaku” characters. It’s better to use iconv with CP932 encoding and //TRANSLIT. There is one snag with CP932 encoding with //TRANSLIT and that is with regards to the “hankaku” yen character (“¥”). Converting to “yen” isn’t really a nice solution. You can see however that //TRANSLIT always converts to ASCII, and “yen” probably is the only way you can sensibly convert the ¥ mark. Otherwise, it’s a good idea to use the “zenkaku” (double-width) “¥”.
  4. The micro mark “µ” is not supported in Shift-JIS but the Greek mu “μ” is. Therefore, if you want to write a micro mark in Shift-JIS, you should use the Greek mu instead. Again, iconv with //TRANSLIT does the correct thing (converting it to “u”).
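As a rough sketch of the two approaches discussed above (exact results depend on your PHP build and its iconv library);

[php]
<?php
$text = "™µ";  // characters with no direct Shift-JIS mapping

// mb_convert_encoding: unmappable characters are lost or
// replaced, as seen in the table above
$a = mb_convert_encoding($text, "SJIS-win", "UTF-8");

// iconv with //TRANSLIT: substitutes a close ASCII match
// where it can find one
$b = iconv("UTF-8", "CP932//TRANSLIT", $text);
?>
[/php]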

Beware of Character Encoding during Cut & Paste of Websites

The issue is very simple;

Do not assume that a website written in “Shift-JIS” will only contain characters that can be represented in “Shift-JIS”.

If you copy any characters that cannot be represented in “Shift-JIS” and paste them to another web page also coded in “Shift-JIS”, it may generate garbled-text (Mojibake).

For example, you may copy text from a website coded in Shift-JIS which uses the entities &trade;, &copy; or &reg; to display ™, © or ®. These characters are not available in Shift-JIS. If you paste them into your webpage, which is also in Shift-JIS, you will see garbled text or a “?” sign (Mojibake).

Another example is if the page has an element that is loaded via Ajax. The Ajax payload will be handled by Javascript, which will handle the payload as Unicode. It is capable of inserting characters into the DOM that are not representable in Shift-JIS.

The thing to keep in mind is that the HTML character set is only a transfer protocol. It does not govern or limit in any way the text that can be displayed on the browser. Hence you cannot assume that all the text that you see on the browser is encodable in a particular encoding except Unicode.

What Are “Services”

There is a lot of discussion on the Internet about how “services” are essential to tech companies.

Ben Bajarin recently raised the point that even though Google is using their services as a weapon to fend off the proliferation of AOSP (Android Open Source Project) devices, Google’s services are actually only relevant in markets like the US and UK, but much less so in other regions.

What you see with regard to the Google Play services availability is the biggest issue facing Google. It is one that is forcing, in a good way, local companies in those regions to create and bring to market services of their own to support their region. China is the best example of this do date. Granted China’s Android ecosystem is a bit messy with over 100 different app stores but the region is quickly fixing these issues and consolidating.

The fact that Android is being used as an open source platform is not necessarily a bad thing for Google. What is challenging is that they are not making the impact with their services the way they need to be in many of these regions. Their competition in this case is not from the likes of Apple or Microsoft necessarily but from savvy startups looking to solve a problem in their region and doing it better than Google can thus keeping Google out of regions they may wish to compete.

I totally agree with Ben’s argument, but I would also suggest that what we are simply calling “services” should be broken down into certain sub-categories. For example, looking at the Wikipedia table on Google Play availability, we see that “paid apps and games” are available in the majority of countries, whereas books, movies and music are not. Compared to the same chart for Apple’s iTunes store, Google Play is extremely lacking in books, movies and music but not very different in apps. This suggests that digital distribution of apps is a very different business compared to that of books, movies and music.

The reason why there is a large difference is rather obvious. In the case of apps, Apple and Google are the gatekeepers. They do not have to negotiate with the content owners over whether they can distribute the content in a certain country and at what prices. They make the decisions or the developers make the decision when they submit the app.

For books, movies and music, the rights to distribute content are much more complicated. The content owners have much stronger bargaining power and they often have different agreements in each country. Each country may have their own distributor network which may have exclusive rights for distributing content in that country. Furthermore these distributors might have plans for their own digital distribution which would compete with what Google and Apple are planning to offer.

Hence the difference between Apple and Google Play is most likely the difference in negotiating power, skill and previous relationships with the content owners. Essentially, it boils down to the ability to make deals.

With this in mind, I propose that we break down “services” into the following;

  1. self-owned services: These are the services where the provider has ownership of the content. Examples are search, social network services and web-based services (Google Apps, etc.).
  2. self-controlled services: These are the services where the provider can distribute without negotiating with a strong content owner. The prime example is apps. App vendors are generally quite small and have little bargaining power relative to the service provider.
  3. third-party owned services: These are the services where you are selling content that is owned by a third-party, and that third-party has strong negotiating power over distribution (unlike in the case of apps). Examples are music, books, movies, etc. Distribution of this content was historically done physically through retail networks and this resulted in complex networks and agreements, which are often different in each country. Also this content tends to be much more expensive to create than “self-owned service” content, thus requiring large companies to fund production. These large companies obviously have strong negotiating power.

When we map companies like Google, Apple, Amazon, Twitter, Facebook, Spotify, and Pandora to these categories, we find that no company is strong in all three. Twitter and Facebook are exclusively in the “self-owned services”. Amazon, Spotify and Pandora are exclusively in the “third-party owned services”. Google is mostly in the “self-owned services” and to some extent in the “self-controlled services”. They are however very weak in “third-party owned services”. Apple is strong in “third-party owned services” and strong in “self-controlled services”. They are however weak in “self-owned services”.

From an international perspective, “self-owned services” and “self-controlled services” are relatively easy for the service provider to provide in many different countries. However, “third-party owned services” are very difficult. Amazon for example has very limited international reach. The fact that Apple has in fact been able to provide their services in a large number of countries is very much the exception.

These three categories will probably have very different dynamics and I sense that it will be very difficult for any single company to excel in all of them. At least that seems to be the case so far.

Google Plus is an SEO Tool

There was a good article on the New York Times about Google’s spooky social network, Google Plus.

Some quotes from the article;

Thanks to Plus, Google knows about people’s friendships on Gmail, the places they go on maps and how they spend their time on the more than two million websites in Google’s ad network. And it is gathering this information even though relatively few people use Plus as their social network. Plus has 29 million unique monthly users on its website and 41 million on smartphones, with some users overlapping, compared with Facebook’s 128 million users on its website and 108 million on phones, according to Nielsen.

Starbucks, for instance, has three million followers on Plus, meager compared with its 36 million “likes” on Facebook. Yet it updates its Google Plus page for the sake of good search placement, and takes advice from Google representatives on how to optimize Plus content for the search engine.

“When we think about posting on Google Plus, we think about how does it relate to our search efforts,” said Alex Wheeler, vice president of global digital marketing at Starbucks.

“You might not need jQuery”

I stumbled on the You Might Not Need jQuery website, and I think that it’s a fantastic idea.

What it does is that it compares code written using jQuery, and code written in plain Javascript. If you develop mainly in jQuery, this helps you to write the same code in plain Javascript. On the other hand, if you are like me and don’t like jQuery for any reason, then it’s a good resource to learn from other people’s code.
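The flavor of the comparisons is something like the following, merging option objects (my own example in the site’s spirit, not taken from it);

```javascript
// jQuery style:
//   var merged = $.extend({}, defaults, options);

// Plain Javascript, using Object.assign for a shallow merge:
var defaults = { color: "red", size: 10 };
var options = { size: 20 };
var merged = Object.assign({}, defaults, options);
// merged is { color: "red", size: 20 }
```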

It also links to some good libraries that can be used independently of jQuery.

Personally, I find using jQuery quite annoying for the following reasons.

  1. It can noticeably slow down page loading, especially on mobile. It even slows down PCs with Core i5 processors by 100ms.
  2. The jQuery website boasts that it’s only 32kB minified and gzipped, and they call it lightweight. On the contrary, the Ponzu system that I’m developing, which uses Javascript for AJAX, hashtag-based navigation, localStorage-based page caching, JSON-driven HTML templates and more, is less than 20kB total (minified and gzipped). It’s hard to justify 32kB when the vast majority of code is not going to be used.
  3. There are often too many functions doing similar things (and I’m saying this coming from Ruby, which also has a lot of redundant functions). Event handling is especially an area that puts me off.

In Ponzu, we use jQuery only if the client is Internet Explorer. We have a small number of shims that use jQuery as a compatibility layer.

How Many Companies Use Lotus Notes

In understanding the process of innovation and technology diffusion, it is important to analyze how long it takes for an outdated and unpopular technology to actually be eradicated from the market.

That is why I am interested in knowing how many companies use Lotus Notes these days.

I’m having difficulty finding credible information, but I’ll post here what I find.

From the salesforce.com blog;

Well, the reality is that Notes penetrated companies pretty darn well back in the 90’s (like a Nirvana song permeated the radio waves), and the departmental applications sprouted and filled all the holes that IT often couldn’t get to. Love it or hate it, Notes became a mainstay platform of the enterprise. In a recent survey we did of our Dreamforce 2012 attendees, we found that 73% did indeed still use Lotus Notes. And that 70.3% were considering replacing Lotus Notes, the majority within the year.

From an older source, which mentions that companies might not be using Lotus Notes for email but for other things, which would make a market share comparison rather difficult;

Jim goes on to explain that, by a wide definition of “use Lotus Notes and Domino software”, even Microsoft would be a Notes customer.

How Are iPads Actually Being Used in the Enterprise?

There is a lot of discussion on how tablets (iPads) are replacing PCs. I have been generally skeptical of this view based on tablet usage data (1, 2).

The argument that tablets are replacing PCs is generally based on the decline of PC sales coinciding with the rise in tablet sales. This is true. However, there is little discussion of cause and effect. It is entirely possible that these sales trends are not strongly related; they may simply have happened at the same time by coincidence.

Also, there are many tech bloggers and analysts who claim that they have managed to get by on their iPads alone, and only using their PCs very rarely. Or some people will claim that their parents have simple needs which are completely covered by an iPad. I have no reason to doubt these arguments, but on the other hand, I have very little reason to believe that the majority of users, especially in corporations, would feel and act the same way.

What is sorely missing in the vast majority of discussions is how corporations are actually deploying iPads. Things like the following;

  1. How many people in the organization are getting iPads?
  2. What are iPads being used for by which people?
  3. Do the people who use iPads stop using their PCs?
  4. How do the iPads integrate with the preexisting corporate IT setup?

We can only reach a good idea of the potential market size of corporate tablets if we carefully analyze these points.

A few days ago, an article was published on ITMedia (a Japanese IT publication) that described how and why a large company introduced iPads into their IT infrastructure. I thought that it was very insightful and I have listed some points below. It tells us what iPads are good for, and importantly, why they limited distribution to only their managers and executives.

  1. The company is Mizkan, a food company that has been around for 210 years (a history almost as long as that of the United States of America). This company has 2,900 employees and a revenue of 170 billion yen (~1.7 billion USD).
  2. They have been using IBM Lotus Notes/Domino within their IT infrastructure since 1996.
  3. One main function of the Notes system was workflow management. Since their business involves products that can directly affect customers’ health, accountability is key. They need to have a strict approval process.
  4. The managers who are responsible for the approvals are often on the road and are often not able to open their laptops. This led to delays in the approval workflow.
  5. They installed “Lotus Notes Traveller” into iPads together with some custom applications designed to work together with Notes. These iPads were handed out to the managers and executives who were responsible for approvals.
  6. As a result, they were able to significantly reduce the time to get approvals from all concerned executives and managers.
  7. Some executives have expressed that they don’t take their PCs around anymore and that the iPad is sufficient when on the road or at home.
  8. Importantly, Mizkan has no plans to introduce iPads to their lower-level office workers. This is because whereas executives rarely have to prepare documents themselves, normal employees have many jobs which use keyboards extensively. Mizkan predicts that normal employees will not be able to complete their tasks on tablets alone.

My takeaway from this article is the following;

  1. Corporate IT has many more functions than email, document/file sharing and project management. These functions are already provided by legacy solutions.
  2. The new generation of devices (smartphones and tablets) is not going to replace corporate IT infrastructure overnight. Instead, they have to integrate with the current systems. This means integration with Lotus Notes, Microsoft Exchange and all the other solutions that corporate IT has accumulated.
  3. The majority of workers in the office are going to stick to PCs. Hence PCs will most likely remain in the center.

Will tablets never replace PCs? I don’t necessarily think so. I think they eventually will. But I think it is increasingly important to reflect on Steve Jobs’ own words as he introduced the iPad;

  1. Better at browsing the web than a laptop.
  2. Better at Email.
  3. Better at enjoying and sharing photographs.
  4. Better at watching videos.
  5. Better at enjoying your music collection.
  6. Better at playing games.
  7. Better at reading eBooks.

If there is going to be a third category of device, it’s going to have to be better at these kinds of tasks than a laptop or a smartphone. Otherwise, it has no reason for being.

Extending Steve’s discussion, if the iPad is going to replace the PC, it’s going to have to be better than a laptop at current corporate IT tasks.

That’s a pretty tall order.

2013 Smartphone Sales Decreased in Japan

MM Research Institute (MMRI) recently published a couple of reports (1), stating that in Japan in 2013, smartphone shipments decreased by 3.7%. This was due to a combination of the following factors;

  1. Total mobile phone shipments decreased by 10.2%.
  2. Smartphone penetration is nearing saturation at roughly 45% of total mobile phone subscriptions.

Smartphone saturation

Observe the following graph from MMRI. It shows the number of subscribers: blue is for smartphones and pink is for feature phones. The last bar is for Dec. 2013.

You can see how smartphone penetration is saturating. The current smartphone penetration is 44.5% and it looks like it might stop at 50%.

(Graph: smartphone vs. feature phone subscriber numbers, from MMRI)

Additional information from the report;

  1. 52.4% of feature phone owners answered that their next purchase would be a feature phone. Only 34.4% said that their next purchase would be a smartphone.
  2. Reasons for not purchasing a smartphone include a) pricey data plans, b) no need for the additional features, c) difficulty of use.
  3. Smartphone users average 6,826 JPY per month whereas feature phone users average 3,746 JPY per month.

In interpreting this data, you have to understand that Japanese feature phones are pretty capable. They can do email (even email to/from PCs), surf mobile web sites (and there are many of these in Japan), play music, watch TV, take photos, play games and make NFC enabled purchases. You can even use LINE, the explosively popular messaging app although features are limited.

Also, virtually all smartphone data plans in Japan are unlimited data. There are some pay-as-you-go schemes but you quickly reach the ceiling after which your plan actually becomes the same as an unlimited data plan. Pre-paid plans are rare.

On the other hand, feature phones typically do not need data plans to access email or watch TV. A cheap voice plan is sufficient. You can subscribe to a data plan if you want to surf the mobile web or do more complex stuff, but I suspect that most of these users are now using smartphones.

Smartphone sales decline

MMRI data for 2013.

  1. Total mobile phone sales decreased by 10.2%.
  2. Smartphone sales decreased by 3.7%
  3. Apple garnered a 32.5% share of mobile phones (+9.2 points vs. 2012), or a 43.6% share of smartphones.
  4. Other vendors are Sharp (14.6% share), Sony (12.6% share), Fujitsu (9.7% share), Kyocera (8.8% share), Samsung (5.9% share)
  5. Smartphones accounted for 74.1% of total mobile phone sales.

Combining the subscriber base (44.5% on smartphones) with the annual sales (74.1% smartphones), it is clear that feature phone users are holding on to their old models. This is probably because R&D on feature phones has ceased and no new features are being added. Additionally, carriers are not promoting feature phones.

Implications for countries outside of Japan

What this data means is that around 50% of Japanese mobile phone subscribers do not need the high-end features of smartphones, and would be satisfied with email and voice. They don’t need Facebook or LINE on their phones (although they could if they paid for a data plan). They just need a convenient way to communicate.

Now assume that we can apply this 50% number to other countries. Since these countries do not have the feature-rich feature phones that the Japanese have enjoyed for more than a decade, we can assume that low-end Android phones on pre-paid plans are being purchased instead.

What I am trying to say is that although U.S. smartphone penetration is now at 64%, which is significantly higher than the Japanese 44.5%, a large proportion of this number probably includes subscribers on cheap pay-as-you-go or pre-paid plans. These subscribers may be using their smartphones in a manner that is similar to Japanese feature phone users, hence including them in smartphone market share is potentially misleading.

In other words, U.S. smartphone penetration may be significantly higher than Japan but the way that people are using mobile phones in general might be much more similar.

Smartphone penetration is not the right metric

Instead of looking at smartphone penetration, I propose that we should be looking at data consumption. We should be looking at what percentage of subscribers use their smartphones for services over the Internet, thereby consuming lots of data, and what percentage use them only for voice and simple messaging. Instead of looking at the hardware, we should be looking at how people use it. If data consumption figures are hard to obtain, we should use the data plan (unlimited, postpaid, prepaid) as a proxy.

Similarly, we should be looking at how many iPhone users consume lots of data and how many Android users consume lots of data.

In other words, at the low-end, Android is not a smartphone platform. It is a platform upon which vendors build a feature phone.