In this study, the web server software (e.g. Apache, Microsoft Internet Information Server) used on HTTP servers located in Turkey was analysed. The goal was to collect information about the Turkish Internet infrastructure and to obtain data on trends in open source software usage in Turkey.
METHODOLOGY
This study would not have been possible without the generous support of TÜBİTAK ULAKBİM. I sincerely thank the ULAKBİM team and Mr. Onur Temizsoylu from this team for their support.
IMPORTANT NOTE: This study is the result of personal interest. Neither the supporting organisation(s) nor the company I currently work for can be held responsible for it. Because of the way this research was conducted, the capabilities of the tools, and the way the results were analysed, mistakes may have been made. The information presented here is not guaranteed to be correct. The results need to be interpreted in light of the methodology and its potential pitfalls. This study may not be used without reference to this page, and hence to the methodology and to this notice.
Internet traffic passing through the Turkish Academic Backbone was analysed for one month, and the IP addresses in Turkish autonomous systems that were generating HTTP traffic were identified. These IP addresses were then tested with the httprint tool and the server responses were analysed. The tests were performed twice on different dates, and only the web servers that responded to both requests were taken into consideration. When answering a request, a web server usually sends information about the web server software it is running. Some servers obfuscate this information for security reasons. This research relies primarily on the information returned by the web server software itself. Where this information was unavailable or obfuscated, the guesses made by the httprint tool were used.
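The selection and identification steps described above can be sketched roughly as follows. This is a minimal illustration and not the tooling actually used in the study: the function names, the dictionary layout, and the idea of passing in a fingerprinting guess as a plain string are assumptions made for the example, and the IP addresses are documentation addresses. The sketch does not try to detect obfuscated banners, which in practice requires manual judgement or a fingerprinting tool.

```python
from typing import Dict, Optional, Set


def server_banner(raw_response: str) -> Optional[str]:
    """Return the value of the Server header of a raw HTTP response, or None."""
    head = raw_response.split("\r\n\r\n", 1)[0]   # header section only
    for line in head.split("\r\n")[1:]:           # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            return value.strip() or None
    return None


def identify(banner: Optional[str], fingerprint_guess: Optional[str]) -> Optional[str]:
    """Prefer the banner sent by the server; otherwise fall back to a guess
    supplied by a fingerprinting tool (hypothetical input for this sketch)."""
    return banner if banner else fingerprint_guess


def responded_both(run_a: Dict[str, str], run_b: Dict[str, str]) -> Set[str]:
    """Keep only the IP addresses that answered in both test runs."""
    return set(run_a) & set(run_b)


raw = "HTTP/1.1 200 OK\r\nServer: Apache/2.0.54 (Debian GNU/Linux)\r\n\r\n<html>"
first_run = {"192.0.2.1": raw, "192.0.2.2": raw}
second_run = {"192.0.2.2": raw, "192.0.2.3": raw}
kept = responded_both(first_run, second_run)
```

Here only 192.0.2.2 would be kept, because it is the only address present in both runs.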
In this study, IP addresses were used instead of domain names for web server identification. This contrasts with previous publications, which identified web servers by domain name. This approach was chosen because it is impossible to identify the web servers in Turkey through domain name analysis alone. The net result of this approach was that every individual IP address/server was reached only once. Because of the flexibility of the HTTP protocol, it is possible to bind hundreds, even thousands, of domains to a single IP address. Without actual knowledge of the domain names, it is not possible to determine how many domains a server hosts on a single IP address. Therefore, every IP address was counted only once, no matter how many domains it hosts. Servers with more than one IP address may have been included in the results once for each of their IP addresses.
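The difference between counting by domain and counting by IP address can be shown with a small, made-up example. The hosting table below is entirely hypothetical (documentation IP addresses and example domains), and in the study itself the domain lists were not known at all.

```python
from typing import Dict, List

# Hypothetical hosting table: IP address -> domains bound to that address.
hosting: Dict[str, List[str]] = {
    "192.0.2.10": ["a.example", "b.example", "c.example"],  # shared hosting
    "192.0.2.11": ["d.example"],
}


def count_by_domain(table: Dict[str, List[str]]) -> int:
    """Domain-based survey: every domain is one data point."""
    return sum(len(domains) for domains in table.values())


def count_by_ip(table: Dict[str, List[str]]) -> int:
    """IP-based survey (the approach of this study): every address counts once."""
    return len(table)


print(count_by_domain(hosting), count_by_ip(hosting))  # prints "4 2"
```

A multi-homed server would appear under two keys in such a table and would therefore be counted twice by the IP-based method, which is the limitation noted above.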
It is fair to expect that using a server list based on academic backbone traffic had some effect on the results. For a server to be included in the web server list, there had to be HTTP traffic between an end host in the academic network and the server; or, if the server was located in the academic backbone, there had to be HTTP traffic between any end host and that server. Considering that there are almost 300 thousand users in the academic backbone, it is arguably reasonable to assume that any active web server in Turkey would have been identified by this approach within the one-month period. However, less active web servers were more likely to be identified if they were located in the academic backbone. Identification of less active web servers outside the academic backbone also depends on the profiles of the academic backbone's users. This needs to be considered when looking at the results.
The total number of identified IP addresses that responded to HTTP requests was 10,088. All the analyses are based on these responses. Overall web server preferences are as follows.
1. Web servers such as Microsoft Internet Information Server (hereafter IIS) and Apache, which are very common in web hosting environments, most probably host more than one domain on a single IP address. This study does not take this into account. The figures here represent an analysis made on an IP address/server basis, as explained in the methodology.
2. The Cisco web server mentioned here is the web server used on various Cisco devices (Cisco IOS, Cisco PIX etc.). In almost all cases a Cisco web server is used solely for administering the device in question. While it is interesting to see such a high percentage of Cisco web servers, these seem to result from web-accessible Cisco devices. Hosting more than one domain on a single IP address, which is especially relevant for IIS and Apache, will certainly not be the case for a Cisco web server. However, because of the way these systems operate, in almost all cases each device will have more than one IP address. These IP addresses may have been included in the tests separately. Unfortunately, there is no reliable way to prevent this.
3. RomPager and Nucleus web servers seem to be used for the web management interfaces of DSL modems and routers.
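As a rough sketch of how banners might be grouped into the product families discussed in the notes above and turned into percentage shares, consider the following. The matching rules, the family labels, and the sample banners are assumptions made for illustration; they are not the classification rules used in the study and they do not reproduce its figures.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Hypothetical matching rules: (lowercase substring, family label).
FAMILIES = [
    ("microsoft-iis", "IIS"),
    ("apache", "Apache"),
    ("cisco", "Cisco"),
    ("rompager", "RomPager"),
    ("nucleus", "Nucleus"),
]


def family(banner: Optional[str]) -> str:
    """Map a Server banner to a product family, or "Other" if nothing matches."""
    text = (banner or "").lower()
    for needle, label in FAMILIES:
        if needle in text:
            return label
    return "Other"


def shares(banners: Iterable[Optional[str]]) -> Dict[str, float]:
    """Percentage share of each family, counting one banner per IP address."""
    counts = Counter(family(b) for b in banners)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


sample = ["Microsoft-IIS/6.0", "Microsoft-IIS/5.0", "Apache/2.0.54 (Unix)", "cisco-IOS"]
result = shares(sample)
```

With this made-up sample, IIS accounts for half of the addresses and Apache and Cisco for a quarter each.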
Operating Systems used for Apache
When a web server responds to an HTTP request, it may provide further information in addition to the name of the web server software (server version, the platform it is running on etc.). For instance, a standard Apache configuration includes the Apache version and the operating system (OS) information in an HTTP response. The operating systems of the identified Apache web servers were classified using this approach; the results are presented below.
The server information that an Apache server sends in an HTTP response is configurable by the administrator. Therefore, it was not possible to obtain OS information from all the Apache web servers in question. Several other factors, such as the packaging system of the OS and whether Apache was manually compiled, also influence this response. This is one of the reasons why the percentage of the "Other" category is so high.
1. The "Unix" category indicates that an Apache server runs on a Unix (Solaris, FreeBSD etc.) system. An Apache server that is manually compiled on a Linux system will also show up in the "Unix" category instead of the "Linux" category.
2. Most of the systems in the "Other" category are those that provided no information other than the server name in the HTTP response. Most of these servers probably run Unix, Linux or Windows, but they could not be classified in the absence of relevant information.
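The OS classification described above can be sketched as a lookup on the parenthesised platform token of the Apache banner. The keyword lists and the function name are assumptions made for this example; the study's actual classification rules are not documented here, and a real survey would need a longer list.

```python
import re

# Hypothetical keyword lists, checked in this order.
WINDOWS_HINTS = ("win32", "win64")
LINUX_HINTS = ("linux", "debian", "ubuntu", "red hat", "fedora",
               "centos", "suse", "mandrake", "mandriva", "gentoo")
UNIX_HINTS = ("unix", "freebsd", "openbsd", "netbsd", "solaris", "darwin")


def apache_os(banner: str) -> str:
    """Classify an Apache Server banner as Windows, Linux, Unix or Other."""
    match = re.search(r"\(([^)]*)\)", banner)   # e.g. "(Debian GNU/Linux)"
    if not match:
        return "Other"                          # server name only, no platform
    token = match.group(1).lower()
    if any(hint in token for hint in WINDOWS_HINTS):
        return "Windows"
    if any(hint in token for hint in LINUX_HINTS):
        return "Linux"
    if any(hint in token for hint in UNIX_HINTS):
        return "Unix"                           # includes Apache compiled by hand on Linux
    return "Other"


print(apache_os("Apache/1.3.33 (Unix) mod_ssl/2.8.22"))  # prints "Unix"
```

Note that a banner of "(Unix)" says nothing about whether the underlying system is Linux, which is why manually compiled servers end up in the "Unix" category.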
In most cases, an Apache web server that is prepackaged with a Linux distribution also reports the name of the distribution in its response unless specifically configured not to. The distributions of the identified Linux servers were classified in this way; the results are presented below.
It is important to note that the above approach is only applicable to distributions that have a packaging system and packages for the Apache web server. Cases where a Linux distribution did not have a package management system or where Apache was manually compiled could not be taken into consideration.
1. Mandriva and Mandrake are counted together.
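A distribution tally that merges Mandriva and Mandrake into one bucket, as the note above describes, might look like the sketch below. The alias table and the sample banners are illustrative assumptions rather than data or rules from the study.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Hypothetical alias table: lowercase keyword -> reported label.
ALIASES = {
    "mandrake": "Mandriva/Mandrake",
    "mandriva": "Mandriva/Mandrake",
    "debian": "Debian",
    "red hat": "Red Hat",
    "fedora": "Fedora",
    "suse": "SuSE",
    "ubuntu": "Ubuntu",
}


def distribution(banner: str) -> Optional[str]:
    """Return the distribution label found in the banner, or None."""
    match = re.search(r"\(([^)]*)\)", banner)
    if not match:
        return None
    token = match.group(1).lower()
    for keyword, label in ALIASES.items():
        if keyword in token:
            return label
    return None                      # e.g. "(Unix)": manually compiled, no distribution


def tally(banners: Iterable[str]) -> Counter:
    """Count banners per distribution, skipping those that cannot be classified."""
    labels = (distribution(b) for b in banners)
    return Counter(label for label in labels if label is not None)


sample = [
    "Apache-AdvancedExtranetServer/2.0.50 (Mandrakelinux/7.2.100mdk)",
    "Apache/2.2.3 (Mandriva Linux/PREFORK-1mdv2007.0)",
    "Apache/2.0.54 (Debian GNU/Linux)",
    "Apache/1.3.33 (Unix)",
]
counts = tally(sample)
```

In this sample the two Mandrake and Mandriva banners land in the same bucket, and the "(Unix)" banner is left out because it carries no distribution name.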
The versions of the identified Apache web servers are presented below.
1. The "Other" category represents the Apache servers whose version could not be identified.
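Version extraction from the banner can be sketched with a single regular expression. Grouping by release branch (major.minor) is an assumption made for this example, since the granularity used in the study is not stated; banners without a parsable version fall into "Other".

```python
import re

# Matches "Apache/2.0.54" and vendor variants such as "Apache-Something/2.0.50".
VERSION_RE = re.compile(r"Apache(?:-[A-Za-z]+)?/(\d+)\.(\d+)")


def apache_branch(banner: str) -> str:
    """Return the major.minor branch of an Apache banner, or "Other"."""
    match = VERSION_RE.match(banner)
    if not match:
        return "Other"               # version hidden or banner altered
    return f"{match.group(1)}.{match.group(2)}"


print(apache_branch("Apache/2.0.54 (Unix)"))  # prints "2.0"
print(apache_branch("Apache"))                # prints "Other"
```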
Microsoft Internet Information Server (IIS) Versions
IIS runs only on the MS Windows platform. Therefore, it is safe to assume that all 5614 IIS servers are running on MS Windows. The classification of the IIS servers' versions is presented below.