Minimum Total Loss Rounding in Python

Minimum Total Loss Rounding algorithms are quite useful for distribution problems if you need whole numbers after the distribution, as the final result of distribution.

Download Python script/module (version-1)

For example, you want to distribute a certain number of students to a number of available classrooms, approximately in proportion to the ground area (measured in square meter) of each classroom:
Distribute 180 students to 5 classrooms with given ground areas [25, 30, 20, 15, 35] Sum of all given areas = 25 + 30 + 20 + 15 + 35 = 125 = sum([25, 30, 20, 15, 35])

If you distribute students to classrooms you will obtain following fractional numbers:
areas = [25, 30, 20, 15, 35] d1 = [x*180.0/125.0 for x in areas] ([36.0, 43.2, 28.8, 21.6, 50.4])

Because there can’t be 43.2 or 28.8 students in a class, we need to round all these fractional numbers with minimum total loss, that is, with minimum total deviation from the original fractional number. Continue reading

Posted in Uncategorized | Tagged , , , | Leave a comment

Travel and Assignment Planning Software in Python

Software Architecture

Software Architecture

From 8/2015 to 5/2018 I worked for a Swiss company named ETC Transport Consultants as senior data analyst and developer on a part-time basis (60%). Within this time period, in less than three years, I developed a complete solution for Travel and Assignment Planning in Python -with the help of ETC team of course- that accomplished following tasks in an integrated process:

  1. Top-down distribution of measurement requirements (see introduction)
  2. Travel Planning (based on 60+ logical search conditions & constraints)
  3. Assignment Planning (based on 20+ assignment conditions + nonlinear/discrete optimization)
  4. Export of results into excel files (reporting)

You may download my official work certificate from ETC (pdf file) which provides a good overview about this development project.

Travel Planning

Planning Process

Planning Process

Searching, evaluating and sorting optimal tours (with public transport vehicles like train, tram or bus) that are found according to given set of search conditions in planning software…

The planning software is based on central timetable data (HAFAS data) for Switzerland, that include information like departure and arrival times for every transport line. In the simple central architecture of this planning software there are Python modules and PostgreSQL database containing the timetable data, and nothing else. Continue reading

Posted in Uncategorized | Tagged , , , , , | Leave a comment

Top-Down Distribution of Quality Measurement Requirements using Table Functions in Python


Since August of 2015 I have been working for ETC Transport Consultants1 in Switzerland as Senior Data Analyst and Developer on a part-time basis. My job is developing a complete software solution for Travel Planning & Resource Optimization in Python (a popular high-level programming language).

ETC Transport Consultants (ETC) has the mandate from Swiss Federal Office of Transport (FOT) to measure the quality of public transport vehicles (bus, train, tram etc.) in all Cantons of Switzerland. Required sample quantities, that is, how many times the quality of a certain transport line or station must be measured (and evaluated) within a year is provided by FOT, as formal quality measurement requirements.

The test customers recruited by ETC make about 6-hour long round trips on various public transport lines, in order to measure more than 60 attributes of each transport line and vehicle. Each measurement is recorded into a special application on their mobile phones, which finally transmit the measurement data to a central web based database named QDABA that is located in Berlin. These measurement data are then used for generating various quality reports for the FOT, for the Cantons of Switzerland, and for the transport companies.

The complex round-trips and measurement details of these test customers must be planned meticulously in order to fulfill the quality measurement criteria required by FOT. Currently, this planning is done in a dedicated MS Access tool with some semi-automated consistency checks in order to simplify the manual work.

As the first step (named tour planning), tours are planned including details like start time and station, lines and stations to be taken and measured during the tours, cross-overs to nearby stations at certain stations, and so on. As the second step (named assignment planning), the planned tours are allocated to available days and test customers. Continue reading

  1. ETC Transport Consultants GmbH with offices in Berlin and Switzerland offers IT-solutions to transport authorities and transport companies as well as mystery survey for public transport systems. In order to provide fast and reliable data and information flows within the transportation sector, we offer dynamic and company-wide central data hubs (RBL / ITCS). These combine and connect individual components, such as traveler information systems, travel-time analysis, transfer connections and data management in one system. The integration of VDV and other interfaces is enabled. []
Posted in Calculation engine | Tagged , , | Leave a comment

A Simple Notation for Data Tables

We use the simple notation introduced here for data tables in order to formulate:

  1. Information and reporting requirements
  2. Analytical operations on data tables

This table notation can generally be used for requirement engineering. I will use the same notation in the upcoming article about Information Content of Data Tables, a concept that I had developed about eight years ago.

Let’s begin with a simple data table with two key figures (numbers) and three attributes:
Notation for Data Tables
Note that, independent of the storage or processing technology (database, excel, XML file) every data table is conceptually a mathematical information unit.

As an example, that you need tables T1, T2, and T3 for a report means, you need the information content (table data + metadata) of all these tables in order to generate this report (by applying some operations on these tables). That is, the information content (IC) of the requested report must be covered by the aggregated information content of these three tables:

IC (T1,T2,T3) ⊇ IC (report) ⇔ report can be generated with f(T1,T2,T3,metadata)

.. where f is any set of operations like filtering, aggregation, combination etc. to be applied on tables.

I will elaborate more about the information content of data tables in an upcoming article. Let’s continue with the table notation. Continue reading

Posted in Calculation engine | Tagged , , | Leave a comment