Digital Currency

https://file.army/i/B4fG3jK
Thanks to the development of information and communication technology, currency can be made digital and online. Digital means that currency can be implemented on computers, where processes can be automatic and fast. Online means that currency can be used on the Internet, which allows transactions and other activities from anywhere in the world at any time. Today, digital currency is used for online shops and online payments. For example, I pay my electricity, water, phone, Internet, and other bills from my home using a smartphone connected to the Internet. I can also order goods, food, and other items from my home. Digital currency can also be used to conveniently participate in the global financial system through loans, investments, and installments.
https://file.army/i/B4fGwrp

Given the title of this book, you probably want to ask: what is the relationship between digital currency and cryptocurrency? Cryptocurrency is just one type of digital currency; digital currency is the general category. From my knowledge, I know the following digital currencies:

  • Digital Fiat Currency: currency backed and centrally controlled by the government, made digital. While paper cash needs printers and coins need to be minted, digital fiat currency can be created at the press of a button on the government's server. While cash and coins are distributed physically, digital fiat currency can be distributed digitally and online to citizens' applications. Since this digital currency is centralized, the government has complete control over it. Other than printing and distributing, the reverse is also possible: deleting, taking back, and even denying citizens financial services. Example digital fiat currencies are the digital dollar, digital euro, digital pound, digital yen, digital yuan, and digital rupiah.
  • Digital Private Currency: currency made and controlled by an individual or group. Today these currencies are mostly seen in companies, such as Amazon points, coupons, and tokens.
  • Cryptocurrency: currency based on cryptographic technology, which allows the possibility of distortion and manipulation resistance, distribution and decentralization, censorship resistance and unconfiscatability, privacy, and openness.

Bitcoin: The First Cryptocurrency

The easiest way to start understanding cryptocurrency is to understand Bitcoin. Bitcoin is the first cryptocurrency, created by Satoshi Nakamoto in 2008. The ideal concept of Bitcoin is to have the properties of openness, borderlessness, censorship resistance, unconfiscatability, distortion resistance, manipulation resistance, distribution, decentralization, and pseudonymity. Disclaimer: the mentioned concept is idealized and in reality may not be perfect. Also, this book contains no technical explanation of cryptography and the other technologies behind cryptocurrency, because it is intended for users only. Instead, only illustrations or parables are provided, and they may not be fully accurate.

Blockchain

https://file.army/i/B4fGkvD
While in Bitcoin the blockchain is like a ledger that contains all previous transactions, secured against distortion or mutation, as a user you can think of it as a history secured with a technology that can easily detect the slightest change or modification, with the purpose of preventing them. Unless you are a corrupt authoritative figure, you should agree that history should remain truthful as it is and should not be distorted. Blockchain is such a technology: it prevents distortion of history, in this case preventing corruption of financial transactions.
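As an illustration of how a slight change is detected, here is a toy hash chain in Python. It is a heavy simplification of a real blockchain, and the transaction strings are invented; the idea it shows is real, though: each block stores the hash of the previous one, so editing any past entry breaks the checks that follow.

```python
import hashlib

def block_hash(index, data, prev_hash):
    """Hash the block's contents together with the previous block's hash."""
    return hashlib.sha256(f"{index}|{data}|{prev_hash}".encode()).hexdigest()

def build_chain(transactions):
    """Build a toy chain where each block commits to the one before it."""
    chain, prev = [], "0" * 64  # the genesis block has no predecessor
    for i, tx in enumerate(transactions):
        h = block_hash(i, tx, prev)
        chain.append({"index": i, "data": tx, "prev_hash": prev, "hash": h})
        prev = h
    return chain

def find_tampering(chain):
    """Return the indices of blocks whose recorded hashes no longer match."""
    bad, prev = [], "0" * 64
    for b in chain:
        if b["prev_hash"] != prev or b["hash"] != block_hash(b["index"], b["data"], b["prev_hash"]):
            bad.append(b["index"])
        prev = b["hash"]
    return bad

chain = build_chain(["Alice pays Bob 5", "Bob pays Carol 2"])
print(find_tampering(chain))             # [] — the untouched history checks out
chain[0]["data"] = "Alice pays Bob 500"  # quietly rewrite history
print(find_tampering(chain))             # [0] — the edited block is exposed
```

Rehashing the whole chain takes a fraction of a second here; the point is that any distortion of history, however small, is immediately visible.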

Maybe you have heard about the recent Wirecard issue, where nearly two billion euros went missing, or previous scandals such as the seven billion dollar accounting error at WorldCom, Enron's hidden debt, or even the Charles Ponzi scheme back in the old days. If not, you have most probably heard of financial corruption in your country or local area. I still remember my teen days in Indonesia when the Century Bank scandal happened. Nobody knows where the money went, and the wealthy people who put their money there lost all their savings. I remember seeing news that a once-rich woman then had to work as a laborer on construction sites. Who knows how many had to work as maids, or worse, after losing their savings. I firmly believe in the conservation of energy, where energy does not disappear but is transferred. Obviously, the money did not disappear; someone must have taken it. With blockchain technology, transactions can be securely recorded in detail, preventing these kinds of scandals.

Distributed System

https://file.army/i/B4fGmwQ
Blockchain is a good way to record history, thanks to the mechanism whereby a slight change can easily be detected, but unfortunately, blockchain alone is not enough. If there is only one copy of the blockchain on one server, how can modifications be detected when there is no reference? Well, you could keep a backup of the blockchain, or record each modification that occurs, but do you really trust a single entity to be honest? It could simply corrupt both the copy and the modification record. This is where distributed systems come in. The blockchain does not exist on only one server but on many servers that are distributed and able to verify each other's copies.
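A minimal sketch of why distribution helps, assuming honest nodes form the majority (the node count and ledger entries are invented): each node fingerprints its copy of the ledger, and a corrupted copy stands out against the majority fingerprint.

```python
import hashlib
from collections import Counter

def fingerprint(ledger):
    """One hash summarizing an entire copy of the ledger."""
    return hashlib.sha256("|".join(ledger).encode()).hexdigest()

def outlier_nodes(copies):
    """Compare every copy against the majority fingerprint; return the outliers."""
    counts = Counter(fingerprint(c) for c in copies)
    consensus = counts.most_common(1)[0][0]
    return [i for i, c in enumerate(copies) if fingerprint(c) != consensus]

ledger = ["Alice pays Bob 5", "Bob pays Carol 2"]
nodes = [list(ledger) for _ in range(5)]   # five nodes hold the same history
nodes[3][0] = "Alice pays Bob 500"         # node 3 corrupts its own copy
print(outlier_nodes(nodes))                # [3] — cross-checking exposes it
```

Real networks compare and resolve chains with far more sophistication than a majority vote over fingerprints, but the core intuition is the same: with many independent copies, no single server can quietly rewrite the record.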

Decentralization

https://file.army/i/B4fGoqY
After inventing the blockchain and implementing a distributed system, Satoshi Nakamoto invented proof of work, a kind of mathematical algorithm that emulates work on computers, just as people sweat when they work. Combining blockchain, distributed systems, and proof of work produced decentralization, where any computer node can process transactions and other nodes can verify them. While the current financial system requires intermediaries to build trust in transactions, Bitcoin does not, as the algorithm allows every node to reach a consensus. While the current financial system is centralized, with decisions made by the top authorities, Bitcoin is decentralized, working solely on an algorithm, meaning that no single entity can control it.

Mining

https://file.army/i/B4fGsea
Proof of work is like a competition to solve mathematical problems that are too technical to be included in this book. If you are curious, read the 2008 white paper by Satoshi Nakamoto or find other sources. Those who do the work are rewarded with Bitcoin, which is why the process is comparable to mining. This is also how new Bitcoin is generated.
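A toy version of the idea in Python (real Bitcoin mining hashes block headers at a vastly higher difficulty; the data string here is invented): the miner brute-forces a nonce until the hash meets a target, while anyone can verify the result with a single hash.

```python
import hashlib

def mine(block_data, difficulty=4):
    """Brute-force a nonce so the block hash starts with `difficulty` zeros."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest  # the work is done; the nonce is the proof
        nonce += 1

nonce, digest = mine("Alice pays Bob 5")
print(nonce, digest)
# Verification is cheap: one hash call confirms the expensive search.
assert hashlib.sha256(f"Alice pays Bob 5|{nonce}".encode()).hexdigest() == digest
```

This asymmetry, expensive to produce and trivial to check, is what lets strangers' computers agree on who did the work without trusting each other.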

Open and Transparent

https://file.army/i/B4fGJCL
The white paper is open for anyone to read, currently at https://bitcoin.org/bitcoin.pdf. The source code is open for anyone to use, currently at https://github.com/bitcoin/bitcoin. It is even possible to fork the code and start a new coin. The Bitcoin network is open for anyone to participate in, whether as a user, a node, and/or a miner, and transparent for anyone to see. All transactions are transparent, who mines and receives newly issued Bitcoin is known, and the total supply of Bitcoin in existence is also known and determined: 21 million, never less and never more. Therefore, Bitcoin is neither deflationary nor inflationary overall.
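The 21 million figure is not a decree but arithmetic: the block reward started at 50 BTC and halves every 210,000 blocks until it rounds down to zero. Summing the schedule in satoshis (the smallest unit, one hundred-millionth of a BTC) reproduces the cap:

```python
SATOSHI = 100_000_000          # satoshis per BTC
reward = 50 * SATOSHI          # initial block reward, in satoshis
blocks_per_era = 210_000       # blocks between halvings
total = 0
while reward > 0:
    total += reward * blocks_per_era
    reward //= 2               # the reward halves, rounding down
print(total / SATOSHI)         # 20999999.9769 — just under 21 million BTC
```

Because the reward is an integer number of satoshis and halving rounds down, the final total is slightly below 21 million, which is why the supply can be "known and determined" by anyone who runs the numbers.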

Unconfiscatable

https://file.army/i/B4fGaD9
Your Bitcoin is safely stored on the network, and nobody except those who know the keys can access it; ideally, only you should know the keys. By keeping the keys to yourself, not even the highest authorities can confiscate your Bitcoin. This works surprisingly well in oppressive regions where governments that were supposed to serve the people grew corrupt with power. Private keys used to be a large, complicated set of characters, but now there are keys in the form of seed phrases. This is also related to borderlessness: you only need to remember the seed phrase in your head, and you can travel anywhere in the world without needing to carry your funds physically, most of the time without anyone knowing. It is like keeping your funds in your brain. However, there is also a risk: if you lose the keys, you lose access to your funds forever. Therefore, self-responsibility is also raised to the maximum.
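A toy sketch of the seed-phrase idea. Real wallets follow the BIP-39 standard, with a carefully chosen 2048-word list and a checksum; the eight-word list and the derivation below are invented simplifications and must never be used for real funds. The point is only that a phrase is memorable entropy, and the same phrase always reproduces the same key.

```python
import hashlib
import secrets

# Invented word list for illustration; BIP-39 uses 2048 words.
WORDS = ["apple", "brave", "coral", "delta", "ember", "frost", "grove", "haven"]

def toy_seed_phrase(n_words=6):
    """Pick random words; the phrase encodes the entropy behind the keys."""
    return [secrets.choice(WORDS) for _ in range(n_words)]

def toy_private_key(phrase):
    """Derive a key deterministically: same words in, same key out."""
    return hashlib.sha256(" ".join(phrase).encode()).hexdigest()

phrase = toy_seed_phrase()
print(phrase)
print(toy_private_key(phrase))  # anyone holding the phrase recovers this key
```

This determinism is exactly why the phrase both travels borderlessly in your head and, if leaked or forgotten, costs you everything.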

Borderless

https://file.army/i/B4fGnYo
Bitcoin uses a peer-to-peer (P2P) network where, for as long as there is a peer nearby, anyone can connect to the network to perform transactions or other activities. The Internet is part of this network: for as long as you have a computer device connected to the Internet, you can use Bitcoin. If you are unfortunate enough to have no Internet access, there is still the possibility of using other broadcast media, for example radio, to broadcast your transaction. In any case, in today's world you can access Bitcoin from almost anywhere, meaning that you can access your funds.

Uncensorable

Again, Bitcoin is P2P: for as long as there is a peer node nearby, you can connect to the network even if the Internet is censored. Authorities can always try blocking every node, but good luck blocking the new nodes that emerge daily. If you have used any Bitcoin wallet, you have probably wondered why it gives so many warnings not to make a mistake when inputting the receiver's address. That is because transactions are irreversible, not only to prevent distortion and manipulation but also to prevent censorship. If a transaction were reversible, authorities could easily demand that your transaction be reversed if they did not like it.

Pseudonymous

https://file.army/i/B4fGDrU
Bitcoin is pseudonymous: your anonymity depends on how you use the coins. You can choose to be identified from the start, but if you choose to be anonymous, you need to know how to use the coins correctly. The basic rule is not to send your coins to any address that may expose your identity. If you have to do so, then create a new address and get new coins from another source.

Bitcoin to Other Coins

Bitcoin maximalists may say that all coins other than Bitcoin are scam coins and that only Bitcoin is the truth, but in my opinion, that thinking obscures one of the beauties of Bitcoin. The beauty of Bitcoin is that it is open source, so anyone can reuse and modify the code. If anyone wants to build something different, or simply does not agree with some function of Bitcoin, they can freely create another coin and take a different path instead of fighting to change Bitcoin, which would be akin to war. Then let the people choose which coins they prefer. The freedom to choose is one of the beautiful contributions of Bitcoin.

https://file.army/i/B4fGcj3
For remittance, I do not use Bitcoin because it is currently at least $5 more expensive. Five dollars may be a small amount for people in developed countries, but for us in developing countries it is the cost of 4-7 meals. Bitcoin may be the cheapest option in developed countries, but for the route I use, remittance from Japan to Indonesia, Brastel Remit, which is non-crypto, is cheaper. In my case, using Litecoin is the cheapest. Although BNB and XRP are cheaper still, the exchanges I use in Japan either do not list them or offer only expensive rates. If your argument is that this is because I am using fiat, that is an irresponsible argument, at least at this time, because almost all of us earn in fiat. If your argument is that this is because I measure in fiat, I answer that almost all items are measured in fiat; for example, would you spend $5 in transaction fees to buy $1 of coffee? See Chapter 3 for more details. There are already layer 2 solutions, one of which is the Lightning Network, but we are still waiting for the technology to develop and for merchants and exchanges to implement it.

Alternative Coins

https://file.army/i/B4fGN6Z
You have probably heard that there are thousands of alternative coins (altcoins) out there, and the number keeps growing. With so many coins, how do we study them all? Well, there is always the way of studying them one by one, but as a starting user, the simplest approach is to study Bitcoin first, because most of these coins have the properties of Bitcoin with a few, or many, twists. Here are some examples:
  • Ethereum: the first coin for decentralized applications and services, which include decentralized exchanges (DEX) and decentralized finance (DeFi). I heard that the story behind it is that one of the founders, Vitalik Buterin, wanted to build these on Bitcoin, but the Bitcoin developers disagreed because it would spam the network, among other reasons; they wanted Bitcoin to be for transactions only. This is why Vitalik Buterin took a different path and created Ethereum instead.
  • Monero: the first coin that focuses on privacy and anonymity. Bitcoin is pseudonymous, and sending to certain addresses may expose your identity; Monero uses an algorithm that ideally makes transactions untraceable. Monero also resists application-specific integrated circuit (ASIC) and graphics processing unit (GPU) mining so that regular people have a chance to mine using their central processing unit (CPU).
  • Stable coins: coins pegged to certain values, usually by holding the underlying assets in reserve. Example fiat-pegged coins: USDT, USDC, and TUSD are pegged to the dollar, where one coin is worth one dollar; EURS is pegged to the euro; BKRW to the Korean won; bitCNY to the Chinese yuan; BRZ to the Brazilian real; and IDRT, IDK, and BIDR to the Indonesian rupiah. There are also coins such as DAI that are not backed one-to-one but use an algorithm to adjust themselves to a certain value. An interesting type of stable coin emerging today is the commodity stable coin, such as XAUT and PAXG, which are pegged to physical gold.
  • Content coins: coins whose blockchain is specialized for sharing social content. The early projects are Steem and Hive, which built decentralized blogging platforms. Newer ones as of this writing are LBRY and BitTube, well known for focusing on multimedia content. Recently, Blurt and Revain have also emerged.
  • Various mining algorithms: Bitcoin uses SHA256 proof of work (PoW). Many other coins use different proof of work algorithms, such as Ethash, used in Ethereum; Scrypt, used in Litecoin; CryptoNight, which favors CPUs and some GPUs; and Yescrypt, which favors CPUs. Other than proof of work, there is proof of stake (PoS), where the algorithm is based on the amount of coins locked; Peercoin was the first to implement it, and Ethereum is said to be migrating to PoS at the end of 2020. Another one I know is proof of capacity, which uses the amount of hard drive storage. Many more algorithms are emerging.
  • Utility coins: the properties of these coins are usually similar to many other coins, but they offer special services on their platforms. For example, exchange coins such as CRO, BNB, HNST, AWC, TWT, KCS, and TOKOK give you discounts if you use them. If you use MCO (deprecated and now just CRO) or Ternio, you can get their crypto-powered debit or credit cards. You need BitTube as payment to buy more storage on their platform.
  • Fan coins: for example, DOGE was initially created as a joke currency whose logo uses a cute Japanese dog breed called Shiba Inu (Hachiko, if you know it), intended for a fun community. If you like sports, then you should probably take a look at CHZ.

There are many other coins, just as there are many companies out there, and you would need a whole team to research them all. You can legitimately get rich by investing in altcoins, because the concept is the same as any investment: invest in good things before anybody else knows. For example, I bought $70 worth of Statera when I saw their post and read that they are a deflationary coin on DeFi; since the price was still steady, I estimated it was still early. My Statera was eventually worth over $200, and I sold $70 to recover my capital, so now I am in profit. But altcoins are also the most dangerous investment I know, because new projects have a high risk of not surviving. For example, I bought almost $100 of Inmax as a random gamble; for years they did not launch their exchange, and the $100 plummeted to $1, meaning I completely lost the gamble. Also, beware of scams: anyone today can make their own token, and they could even name it Bitcoin. If you buy such a token named Bitcoin, it cannot be used on the Bitcoin network, because they are not the same. Therefore, always do your research first.

Government and Other Private Blockchain

Governments, banks, and companies say they are interested in implementing blockchain. You have often heard them say yes to blockchain but no to Bitcoin, or that blockchain has value but Bitcoin does not. What do they mean? They like the blockchain and the distributed system, but they do not like the decentralization, openness, censorship resistance, unconfiscatability, and privacy. They are in control of the global financial system, and implementing Bitcoin means giving up that control: their ability to print currency, their ability to distribute it to whomever they want, and their ability to enforce monetary policies.

https://file.army/i/B4fGVwq
In summary, they want to develop a currency that uses blockchain and a distributed system but is centralized, censored, closed, and controlled. While cryptocurrencies in general are open for anyone to participate in, government blockchain currency development is closed to the government only: the distributed nodes are only governmental nodes, and if there is mining, only government nodes can mine, which means that only government nodes are allowed to process and verify transactions. I am not sure what their motives are, but in my opinion they are probably to increase security when implementing a digital currency and to prevent internal corruption, while still keeping the power to manipulate the blockchain if they deem it necessary, and to have monetary control over their citizens: who can use it and who cannot, what services are allowed, and freezing or even forcibly taking citizens' balances when deemed necessary. What about company blockchains? Take Facebook's Libra: while a government blockchain is closed to the government, a company blockchain is closed to the company, where the company decides who can enter the space and retains control.
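The open-versus-permissioned difference can be sketched as a single check (the node names and allow-list are invented): an open network accepts valid blocks from anyone, while a permissioned government or company chain accepts them only from an approved list.

```python
APPROVED_VALIDATORS = {"gov-node-1", "gov-node-2"}  # hypothetical allow-list

def may_append_block(node_id, permissioned):
    """Decide whether a node is allowed to process and verify transactions."""
    if permissioned:
        return node_id in APPROVED_VALIDATORS  # closed: members only
    return True  # open: anyone who does valid work may participate

print(may_append_block("gov-node-1", permissioned=True))   # True
print(may_append_block("citizen-42", permissioned=True))   # False
print(may_append_block("citizen-42", permissioned=False))  # True
```

Whoever maintains the allow-list holds the control described above; in an open network there is no such list to maintain.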

Correlation to Our Lives

https://file.army/i/B4fGgqF
Now you know how amazing Bitcoin and other cryptocurrencies are, but unless you are a hobbyist, a follower of economic news, or someone who has faced financial turmoil before, you probably ask: what does cryptocurrency have to do with our lives? Put bluntly: so what, what about it, and then what? You may be living very well right now. You have a large amount of cash in your pocket and balance in your bank account, you can buy what you need or even what you want, and you are even fascinated that you can swipe your debit or credit card anywhere for payments, or purchase online. Then why do you need cryptocurrency? For starters, there are people out there who are not as fortunate as you: people who have had their local currency's value destroyed, who are denied access to banking or for whom banking services are simply unavailable, who have banking but are currently restricted by oppressive authorities, or who are in any way denied participation in the global financial system. Cryptocurrency is mainly for these people. Beyond that, cryptocurrency is for developers, visionaries, supporters, opportunists, educators, tech geeks, speculators, and more. To truly understand cryptocurrency, it is necessary to know about the previous and current financial systems.

Financial Value

https://file.army/i/B4fGtxe

Most of us were born into fiat currency, or in simple public terms, cash accepted as money, which is a tool to communicate value. Simply put, with money you can buy anything, and most of us believe that money is our primary necessity, which is not true. Money is just a tool; what we can get with money is our true necessity. If you cannot see that, consider some history and logic. Ask yourself a question: did money always exist in the past? The answer is no. If you went back in time and gave people dollars, they would think you were crazy. Why would they give you goods for a piece of paper?

The oldest form of trading is barter. I need water and you need food, so I trade some of my food to get some of your water. However, barter has scaling, practicality, and divisibility problems. I have food, but not everybody needs much food; you have clean water, but not everybody needs much water; someone else has clothes, but not everybody needs many clothes. Suppose I need clothes and find someone who has them and also needs food. I have to negotiate how much food to give and how much clothing that person will give. Very impractical, so people began to demand a single unit that could measure the value of every item, an item called money that can buy anything.

People began experimenting with salt, sugar, crops, shells, and other commodities as currencies, but only one type endured throughout history: precious metals. Gold and silver have the key property that no one can simply create them; you have to mine them. This means gold is scarce, with a limited supply. Gold also does not deteriorate: the gold you have now will remain in the same form almost forever, which makes it a good commodity to store. Gold is divisible, so items can be valued in weights of gold; for example, a meal is worth a few milligrams (mg) of gold. People began to mint gold coins, which made trading much more practical than before.

In my opinion, for average people gold was doing well as a currency, but it was not practical enough to be used at the national scale: it is very heavy to carry for massive transactions, not to mention costly, with the risk of being raided or otherwise losing the physical gold. Dividing gold is still not easy for regular people, since it requires smithing, which means there is a limit to its divisibility. Say I carry a few mg of gold but only want to buy one candy; usually I cannot, and I have to purchase many candies or other items instead.

This is where paper money comes in. Instead of carrying heavy gold, we trust banks to store our gold and receive certificates or coupons, each representing an amount of gold. That was the good dollar I knew, where each dollar could be exchanged for a fixed amount of gold. Paper money is easier to carry, easily divisible, and easily recombined. It is also practical enough to be used as a medium of exchange at the national scale. Then came banking, the digital age, and online transactions, and you know the rest.

https://file.army/i/B4fGHC5
Everything is well when the government and its partners are righteous, but what if they are not? In 1971, Nixon ended the Bretton Woods system by taking the Dollar off the gold standard. This means the Federal Reserve would no longer exchange a fixed amount of gold for the Dollar. The deeper meaning is that they can print as many Dollars as they want, whereas previously they needed the gold to back the printing. They should at least have some goods backing the Dollar, but they do not, and print out of thin air. Now they have the capability to do quantitative easing. Although the intention, stimulating economic growth, is noble, the reality is that it indirectly forces you to fund their projects. You should agree that a donation cannot be forced or made without consent.
https://file.army/i/B4fGPDA
Most of us in the open crypto world are sarcastic about banks, with statements along the lines of: why should banks be afraid of taking risks if the government will back them up? If they win huge profits on their investments, most of the profit is theirs, but if they lose, the government saves them by printing fiat currency and giving it to them. And do you know what that means? It means devaluing the cash you hold, which is the same as handing pieces of your cash to them. Logically, it is not the government that bailed out the banks but you, the citizens, except for those who sold their cash and bought physical goods. Think of it like this: you worked hard in a gold mine to get physical gold, and then many alchemists appeared who could create far more gold from nothing. Usually the amount you mined was enough to get a decent amount of food, but the alchemists also need food, and they offer far more gold for the same food you demand. Now, to whom will the food seller sell? Of course, the alchemists. Soon it is the norm that food costs a much larger amount of gold; how much food can you get? Almost nothing. All that hard work in the gold mine suddenly became useless with the birth of the many gold alchemists. This is what you call inflation. The Dollar and other fiat currencies are now the same, where the federal reserves, central banks, or governments are the alchemists. They can create as much cash as they want and give it to whomever they want.
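The alchemist parable reduces to simple arithmetic. With a fixed amount of real goods, your claim on them is your cash divided by the total money supply; doubling the supply (the round numbers below are hypothetical) halves what your unchanged savings can buy.

```python
def purchasing_power(your_cash, money_supply, total_goods):
    """Units of real goods your cash can claim at market-clearing prices."""
    return your_cash / money_supply * total_goods

goods = 1_000                                # fixed real goods in the economy
print(purchasing_power(100, 10_000, goods))  # 10.0 units of goods
# The 'alchemists' double the money supply; your cash stays the same.
print(purchasing_power(100, 20_000, goods))  # 5.0 — half the goods
```

Nothing was taken from your pocket, yet half your purchasing power moved to whoever received the newly created money.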
https://file.army/i/B4fGiz4
In the banking crisis of 2008, they performed alchemy, transmuting paper into Dollars and giving them to the banks. Bitcoin might not have been invented if they had bailed out the citizens instead of the banks, or at least let everything collapse, accepted their mistakes, and built a new, better system. If you look at the message in Bitcoin's genesis block, the very first block, it states discontent with the second bailout. The exact message is "The Times 03/Jan/2009 Chancellor on brink of second bailout for banks". In my opinion, Bitcoin was a protest against the government's irresponsible control of fiat currency, so Bitcoin was invented as a currency that no one can control or politicize.

Before proceeding, let us take a look at some other currency debasement history. Most of this information I got from Guide To Investing in Gold & Silver by Michael Maloney, and I strongly recommend watching his Hidden Secrets of Money episodes.

https://file.army/i/B4fGjhn
From this perspective, the fall of the Roman Empire was caused by overspending on conquering other regions until the government ran out of funds. It started with debasing the coins by embedding less gold and silver, and ended with forcing merchants to lower the prices of their goods and services. Naturally, the economy fell. Before the Roman Empire, there was Athens, which also found a clever way to fund war: mixing 50% copper into its gold and silver coins and performing deficit spending.
https://file.army/i/B4fGys1
It is said that the first paper fiat currency emerged in China around 1370. Merchants who did not accept the paper currency received the death penalty. Not long afterwards, it went into hyperinflation.
https://file.army/i/B4fL9y7
Around 1716, France was in such great debt that even taxes could not cover the interest. John Law proposed paper currency; the economy prospered and everyone lived in great wealth. In the end, however, the system began to collapse when a royal exited it by redeeming his paper currency for gold. Then everyone started to follow: when there was no gold, silver was given, and when there was no silver, copper, until everything collapsed. The paper currency did save the economy for a time, but they blew it up with irresponsible spending and more printing. Perhaps they were living in the delusion of using imaginary assets, or in my opinion, future spending power; or perhaps the reason they succeeded in the beginning was that, quietly and indirectly, they were spending the wealth of the citizens by printing more currency, which is effectively the same as raising taxes many times over, while the citizens were ignorant of the truth.
https://file.army/i/B4fL76s
At the beginning of World War 1, Germany printed currency heavily out of thin air, but inflation was slow to follow. This was because, in such dire times, civilians were conservative, so less currency circulated; they also believed in the currency's purchasing power, so they saved it, meaning higher demand for currency and less demand for goods and services. When times got better, the demand reversed: everybody wanted to spend, and in no time hyperinflation followed.
https://file.army/i/B4fLT3l
After the 1900s, the Dollar and similar fiat currencies went through similar events, namely the overfunding of wars. In the end, they exited the Bretton Woods system by cutting the Dollar off from the gold standard. Today they can print as much fiat currency as they want. As with the earlier explanation about 2008, the authorities decide who should be rich and who should be poor, like a deity deciding who lives and who dies, manipulating the natural law of the free market, which is no doubt unfair.
https://file.army/i/B4fLYqj
Even today, Zimbabwe is still a well-known case of hyperinflation, having printed its fiat into oblivion; you can even find markets selling mountains of cash as souvenirs. Today, Venezuela is in the lead for hyperinflation.
https://file.army/i/B4fL4xk
Bitcoin and other cryptocurrencies can be a solution because their supplies are transparent and fairer than the current fiat currencies. However, there are other solutions to this issue. The oldest forms of money that have survived for centuries are precious metals such as gold and silver. If your goal is only to preserve your wealth, then buying gold and silver has historically proven to be a great solution. Other than precious metals, you can go back to the basics of money and ask why you need money in the first place; the answer is that you want to buy goods and services. Simple examples are stocking food supplies, having shelter, which includes houses, and buying items you need and items you think will be useful in the future. To go further, start being self-sustainable: grow your own fruits and vegetables, build a farm, secure your own water supply, or even build your own electricity source using solar panels, wind turbines, and other renewable energy sources. Once you become smarter, you will probably want to start your own business instead of just holding cash.

Financial Freedom

This book does not emphasize cryptocurrency as the only solution to preserving financial value. You can find gold bugs who agree with Bitcoin activists about the problems with the current financial system but do not agree with Bitcoin, and especially not other cryptocurrencies, mainly because they have no physical form, among many other reasons. As this book stated previously, if the problem is only financial value, there are other solutions that have proven historically effective. Even though Bitcoin and some cryptocurrencies have had the best performance in recent years, regular people cannot handle the short-term volatility. However, beyond tackling financial value, Bitcoin and other cryptocurrencies were created for a larger purpose.

https://file.army/i/B4fL8pv
In my opinion, at the creation of Bitcoin in 2008, the value loss of fiat currency was not the primary problem. The primary problem was a global financial system heavily controlled and perhaps politicized. Remember that the local currency we hold is no longer backed by anything of value, that the central banks along with the government can print as much as they want, and that such printing takes value from you and gives it to others. They can decide who gets wealthy and who gets poor. I am aware that I am bashing the authorities one-sidedly. If I were in their shoes, what would I have done differently? I do not know, but this book is mainly for you, users and citizens. I believe in putting my friends, family, and myself at a higher priority than the nation; it is our default nature. Wise people immediately escape to precious metals and other assets, but do you know that when a country's financial system is about to collapse, it often implements strict monetary policies? This book is about how to save yourself when they abuse the fiat system, not about whether their decisions are right. In fact, I do not really care about debasement, printing out of thin air, counterfeiting, inflation, and so on, but I do care about the freedom to utilize and manage our own wealth however we want, and to join or exit the system whenever we want, which is why I am in cryptocurrency. The previous subsection discussed only the initial phase of a falling economy, which is debasement. When debasement starts to fail, the following usually happens, as it did in all the debasement examples of the previous subsection:
https://file.army/i/B4fLKDH
They may demonetize bank notes. Although theoretically nothing should change, in reality it may indirectly harm citizens. For example, in India in 2016 there were cash shortages, long bank queues, and a short deadline, so some citizens did not have the opportunity to exchange their bank notes. They were left holding useless paper cash, which means their wealth was indirectly stolen.
https://file.army/i/B4fLezf
They will start to ban exchanging and trading assets other than their fiat currency; for example, they banned transactions in gold and in foreign currencies such as the Dollar. Other examples happening during the writing of this book are China discouraging its people from buying precious metals, the FDIC discouraging you from withdrawing cash from the bank, and the global stock markets closing temporarily during the black swan event caused by COVID-19. A more severe example is Zimbabwe permanently shutting down its local stock market, accusing it of being responsible for the collapse of its fiat currency once again, and also blocking all electronic and mobile payments.
https://file.army/i/B4fLx1I
They implement stricter travel rules, for example the rumor that China does not allow its citizens to take their wealth out of the country without the government's permission.
https://file.army/i/B4fLIsV
When it gets worse, they force you to exchange your assets into their fiat currency, such as Executive Order 6102, which confiscated citizens' gold.
https://file.army/i/B4fLSyp
Even worse was the Corralito affair in Argentina, where the government froze citizens' bank accounts and forcibly converted Dollars to Pesos at a rate of 1:1; it then released the peg, and 1 Dollar immediately became 4 Pesos and kept getting more expensive. The worst case is when they outright confiscate or seize your assets, as Cyprus did in 2013 when it seized bank deposits, something that normally happens only to criminals.

The last thing they will try is to force price controls on the market in terms of their fiat currency. This happened in the United States a few years after President Nixon ended the convertibility of the Dollar to gold. The worst was near the end of the Roman Empire, when a law forced citizens to work and continue the family business, but at controlled prices, punishable by death.

With Bitcoin and other cryptocurrencies, regulators can ban them but technically cannot stop or censor them; they technically cannot be confiscated, where the only way is to persuade, pressure, or social engineer the owners into handing them over; both the supply function and the distribution are algorithmically and mathematically defined, which ideally is neutral and not controlled by any single entity; and most are open and transparent.
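The claim that the supply is algorithmically defined can be made concrete. The following Python sketch reproduces Bitcoin's published issuance schedule (a 50 BTC block reward, halved every 210,000 blocks); the constants come from the Bitcoin protocol, but the code itself is only an illustration, not consensus code:

```python
# Bitcoin's supply schedule: the block reward starts at 50 BTC
# and is halved every 210,000 blocks until it rounds down to zero.
SATOSHI = 100_000_000          # 1 BTC = 100,000,000 satoshis
HALVING_INTERVAL = 210_000     # blocks between halvings

def total_supply_satoshis() -> int:
    reward = 50 * SATOSHI
    total = 0
    while reward > 0:
        total += HALVING_INTERVAL * reward
        reward //= 2           # integer division, like the real protocol
    return total

supply_btc = total_supply_satoshis() / SATOSHI
print(supply_btc)  # just under 21,000,000 BTC
```

No committee decides this number; it falls out of the halving rule, which is why the supply is said to be neutral and not controlled by any single entity.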


Enter With a Wallet

https://file.army/i/B4FOjIH
You may have been regularly following the news and diligently following the price. A funny story from a student of Ivan on Tech Academy: the student learned how to program smart contracts and other cryptocurrency-related technology but did not know how to buy, receive, send, and use coins. This chapter will guide you on how to enter the cryptocurrency space.
https://file.army/i/B4FOy0f
To enter the cryptocurrency space, you need a medium that can receive, send, and perform other interactions with coins. A simple medium is called a cryptocurrency wallet, which you can carry anywhere in the world. The first division of wallet categories is into three kinds: custodial, semi-custodial, and non-custodial.

Custodial Wallet

https://file.army/i/B4Fl95I
A custodial wallet is basically cryptocurrency served to be as similar as possible to the current banking system. If you do not want the responsibility of securing your own coins, you store your coins with a cryptocurrency bank and trust the bank to secure them. Another reason starters prefer custodial wallets is the support. When users have problems with the interface, exchanging, depositing, withdrawing, or even losing their password, the support can help them. When anything goes wrong, users can even blame the provider and may get refunds. A good custodial cryptocurrency bank for starters is Coinbase. Not only do they try to make it as simple as possible for you, but they educate you as well, so that one day you are ready to handle non-custodial wallets.
https://file.army/i/B4Fl74V
One feature that remains the same in a custodial wallet, from the user's perspective, is the transaction mechanism, the send and receive function. The transaction process remains irreversible: if you input the wrong address, even with the smallest typo, you will lose those coins forever. That applies to transactions between the custodial wallet and outside wallets, while for transactions internal to the custodial wallet, the provider can fix your mistake because everything is internally controlled. Custodial wallets have always warned users to be careful when typing an address before sending or receiving, while in the past non-custodial wallets gave no warnings at all, so starters often lost their funds because they were not careful when typing addresses. Some custodial wallets have an advanced feature that detects address typos.

Semi-Custodial Wallet

https://file.army/i/B4FlTMp
A semi-custodial wallet is a wallet where both you and a third party have the keys to your funds. The purpose is to ensure that you can recover your keys whenever you lose them, while at the same time trusting the third party to secure your keys and not be irresponsible with your funds. For now, I only know one type, a web wallet: you open a website that is a wallet service, register, and input your username and password to use the wallet, but they also let you have the keys. However, I cannot find the exact semi-custodial wallet I just described; it may not exist yet. The current semi-custodial method that I know is where the third party stores only the encrypted keys and needs your password to decrypt them. They do not store your password, so if you forget your password, the keys are the only way to restore your funds. They call this a hybrid wallet. There are also non-custodial web wallets that just provide an interface to access your wallet, sparing you the need to install a wallet application on your device. Usually, you need to import your keys every time you open the web wallet to access your funds.
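The hybrid scheme above can be sketched with a key-derivation function: the service keeps only an encrypted blob, and your password, which they never store, is stretched into the decryption key. A minimal Python sketch of the password-to-key step, assuming PBKDF2 stands in for whatever vetted scheme a real service actually uses:

```python
import hashlib
import secrets

def password_to_key(password: str, salt: bytes) -> bytes:
    """Stretch a password into a 32-byte key; the service stores only
    the salt and the key-encrypted blob, never the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)

salt = secrets.token_bytes(16)
right = password_to_key("my wallet password", salt)
wrong = password_to_key("my wallet passw0rd", salt)
assert right != wrong   # a wrong password yields a useless key
```

Because the provider never sees the password, forgetting it really does lock you out, exactly as the hybrid-wallet description above warns; only the backed-up keys can restore the funds.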

Non-Custodial Wallet

https://file.army/i/B4FlYaK
A non-custodial wallet is the original concept of cryptocurrency. While a custodial wallet keeps your funds in the custody of a third party, a non-custodial wallet returns full authority and ownership to you. This concept was introduced because people were losing trust in third parties that acted irresponsibly with their customers' funds. It is also for those in oppressive regions where the authorities become tyrants, telling you what to do with your assets. With a non-custodial wallet, no one other than you can access your funds, not even the highest authority. Your funds are unconfiscatable; they would have to torture you to get them. However, the responsibility is also at its maximum: you are solely responsible. If you lose your keys, you lose access to your funds forever, and if you do not secure your keys well, someone can steal them and steal your funds.

Hot Wallet

https://file.army/i/B4Fl49D
People often call this a hot wallet, a wallet that requires an Internet connection to function. You can think of it as an online wallet, although hot wallet is more appropriate because the keys are not stored online but should be kept safe with you, and online usually refers to anything accessible on the world wide web (www), which is rarely the case for a hot wallet. Yes, there are wallets you can access online, under the keyword web wallet, but most hot wallets are downloadable applications for your devices. For starters, I highly recommend trying a hot wallet first because it is user friendly, and the reason it requires an Internet connection is so that it can access, manage, and sort your funds faster. I recommend the Exodus wallet because it has a good user interface and supports many coins; below is an example of using the Exodus wallet:

Backup and Secure Seed Phrase and/or Private Keys

https://file.army/i/B4Fl8OQ
The first thing you should do after creating a wallet is to back up and secure your seed phrase or private keys. You should back them up because if you lose them, you will lose access to your coins forever. You should secure them because anyone who knows them can steal your coins at any time. However, once again from the previous chapter, the advantage is that you fully control your coins, and no one, not even the authorities, can confiscate them as long as no one knows your seed phrase or private keys. To get your seed phrase in the Exodus wallet, find the backup menu. In the new version, go to settings and choose the backup navigation menu.
https://file.army/i/B4FlKkY
Usually you get a few words; write them on paper or print them out and put them in a locker, vault, or anywhere safe. You can also store them on your computer, but beware of hacks, malware, and viruses that can steal the seed phrase. The best case is memorizing them and never forgetting, because the brain is currently the best place to secure things. Warning: I showed the phrase here only for educational purposes, and anyone can steal the funds in this wallet because I showed it in public. In fact, I once tried putting some coins in, and they were immediately sent to an unknown hacker's address. Once your seed phrase or keys are leaked, immediately create another wallet, move all the coins, back up and secure the new seed phrase and/or private keys, and abandon the leaked wallet. In the last part of the video, I demonstrated how coins can be stolen if the keys are leaked.
https://file.army/i/B4FlRHa
If you need the private key of an individual coin address rather than the whole seed phrase, go to home, then wallet, click the desired coin, go to the top right menu, and export the private key.
https://file.army/i/B4FlxSL
Once you have your seed phrase and/or private keys, you do not need to worry about your device breaking, because you can import the keys into other devices and access your coins.

Sending and Receiving Coins

https://file.army/i/B4FlIF9
The other basic functions are sending and receiving coins. Go to the desired coin and find the receive button or downward arrow icon for receiving coins, or the send button or upward arrow icon for sending coins, and click it. If your desired coin is not on the list, click the plus (+) button, or go to settings and find assets to add your coin to the list. If you still cannot find it, you need a different wallet; search on Google for a wallet that supports your desired coin.
https://file.army/i/B4FlS5o
For receiving, you can find your public address in the form of complicated characters, which you need to share to receive a coin. Here, you are provided with a clipboard function for copying and pasting the address, and a QR code; you can choose one or use both. Those functions are there to prevent human error in sharing the address, because a one-character mistake can result in coins being lost forever, since transactions are irreversible. Sharing the wrong coin's address (for example, the sender wants to send you Ethereum coins but you share your Bitcoin address) will also result in coins being lost forever. Always double check; even a triple check is recommended. There are many places to share your address, such as your blog or social media, for donations, and you can print it out and put it in your store to receive payments.
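One reason most typos are caught rather than silently losing coins is that legacy Bitcoin addresses carry a built-in checksum (Base58Check), which wallets verify before sending. Here is a minimal Python sketch of that check, for illustration only, tried on the well-known genesis block address; note that other formats (bech32 addresses, Ethereum addresses) use different checks:

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check_ok(addr: str) -> bool:
    """Verify a legacy Bitcoin address: Base58-decode it, then check that
    the last 4 bytes equal the double-SHA256 checksum of the payload."""
    num = 0
    for ch in addr:
        if ch not in ALPHABET:
            return False
        num = num * 58 + ALPHABET.index(ch)
    n_zeros = len(addr) - len(addr.lstrip("1"))  # leading '1's are zero bytes
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    raw = b"\x00" * n_zeros + body
    if len(raw) < 5:
        return False
    payload, checksum = raw[:-4], raw[-4:]
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4] == checksum

print(base58check_ok("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"))  # True (genesis address)
```

Change even one character of that address and the checksum fails, so a good wallet refuses to send. The checksum cannot catch pasting the wrong person's valid address, which is why double checking is still necessary.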
https://file.army/i/B4Flu8E
For sending coins, put the payee's public address in the address section. Again, use copy and paste or the QR code, and double check to prevent human error and phishing, because transactions are irreversible. You should also carefully check the amount, for example not making a mistake like adding an extra zero. Usually, you are charged a network fee. Once you have checked again and again, send.
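In Bitcoin-like coins, the network fee typically depends on the transaction's size in (virtual) bytes, not on the amount sent, and the fee rate fluctuates with network congestion. A tiny sketch with illustrative numbers (the 140-vbyte size and 25 sat/vB rate are made-up examples, not current values):

```python
def estimate_fee_satoshis(tx_vbytes: int, sats_per_vbyte: float) -> int:
    """Fee is roughly transaction size in virtual bytes times the fee rate."""
    return round(tx_vbytes * sats_per_vbyte)

# A simple one-input, two-output transaction is on the order of 140 vbytes.
fee = estimate_fee_satoshis(140, 25)
print(fee)  # 3500 satoshis
```

This is why sending a tiny amount can cost the same fee as sending a fortune: the network charges for data, not value.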
https://file.army/i/B4FlvMU
A transaction hash (TxID) should be generated, with which you can check the status of the transaction on any blockchain explorer for the respective coin. Wallets today are usually user friendly enough that you can click the TxID, which will redirect you to the transaction status. This TxID is enough to be used as a receipt since it identifies the details of the transaction. If your wallet is not user friendly enough, you can use a search engine, for example searching Google for "Bitcoin blockchain explorer", choose an explorer, and insert the TxID into the search box.
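For the curious: a Bitcoin TxID is not assigned by anyone. It is the double SHA-256 hash of the serialized transaction, displayed byte-reversed, which is why every explorer shows the same ID for the same transaction. A sketch with dummy bytes standing in for a real serialized transaction:

```python
import hashlib

def txid_of(serialized_tx: bytes) -> str:
    """Double SHA-256 of the raw transaction bytes, hex-encoded
    byte-reversed, the way Bitcoin displays transaction IDs."""
    digest = hashlib.sha256(hashlib.sha256(serialized_tx).digest()).digest()
    return digest[::-1].hex()

fake_tx = b"\x01\x00\x00\x00...not a real transaction..."
print(txid_of(fake_tx))  # 64 hex characters, deterministic for the same bytes
```

Because the ID is derived from the transaction contents themselves, it works as a tamper-evident receipt: anyone can recompute it and look it up independently.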

Built-in Exchange

https://file.army/i/B4Fl6a3
Some wallets have a built-in exchange for you to exchange coins; for example, this image shows exchanging BTC to USDT. This is a good function, and I expect that soon most wallets will have a built-in exchange.

Other Functions

The necessary functions of a wallet have been discussed. Explore the other functions yourself. I may cover interesting functions in separate articles.

Other Hot Wallet Types

https://file.army/i/B4FlOBZ
The Exodus wallet demonstrated in this book is a standalone installed application. There are other, more convenient wallets, but unfortunately today, the more the convenience, the greater the security risk. Therefore, I do not recommend storing a large amount of funds in them. Only use these wallets for trials and fast transactions. Another type of wallet is an extension or plugin wallet; for example, this image shows that I installed the Guarda wallet on my browser. If you are a browsing maniac who is too lazy to install anything, you can try a web wallet, where you just need to type the web address; for example, Guarda also has a web wallet at https://guarda.com/web-wallet. This is convenient for those who use a full browser operating system such as Chrome OS. Of course, since these are still non-custodial wallets, you still keep the keys.

Cold Wallet

https://file.army/i/B4FllOq
A cold wallet is a wallet that does not require an Internet connection, or more accurately, a wallet that keeps your keys offline. Keeping your keys offline mitigates the risks from online malicious activities, which should be safer than a hot wallet. This does not only apply to keys but to any digital data that you want to keep safe. An example is Electrum, which has a cold storage mode where you can open it offline, although by default it connects to the Internet when you are online.

Paper Wallet

https://file.army/i/B4FlGkF
As the title of this subsection says, a paper wallet is a wallet made out of paper. It contains your public address to receive coins and your keys to send coins. I stated that the public address can simply be shown at the cashier to accept a cryptocurrency payment, but how do you use the keys that are on the paper? You can have, for example, a one-time smartphone application that scans the QR code of the key each time you want to send coins; after a successful transaction, the key information on your smartphone is destroyed, meaning the key is never stored on the smartphone. It does not necessarily have to be a computer device connected to the Internet, just any way to connect to the network and use the keys to send the coins.
https://file.army/i/B4FlLPe
You can always study the theory and perform the mathematical calculations yourself to generate a public address and a private key, but this book would no longer be a 101 for users if that were the case. You can start by searching on a search engine such as Google or Presearch for "paper wallet generator". As an extra precaution, find more information about the website you chose by reading it thoroughly, asking a trusted friend, searching whether it is a scam or not, and checking social media.
https://file.army/i/B4Fl1S5
After you choose a website (for example, here I chose https://www.bitaddress.org, the oldest paper wallet generator with no issues up to now), you can always just visit the site and save the page, but the domain can change ownership, and there is no guarantee that the new owner is not malicious. Therefore, I suggest opening the link to their source code at https://github.com/pointbiz/bitaddress.org and downloading the .html file or the whole code there.
https://file.army/i/B4FlQN4
Turn off your Internet connection and open the .html file using a browser; I recommend doing it in incognito mode. For starting users' convenience, I demonstrate on Windows, but if you want stronger security, I suggest doing it in a fresh Ubuntu live session, or even better the Tails portable operating system, or anywhere you think is free of malware.
https://file.army/i/B4FlA8n
Randomly hover, click, and scroll the mouse, and type in the text box to generate a random address. The more random your actions, the better. Choose paper wallet, and print. If you do not like the default print, you can just save the image and redesign it yourself. If you instead want a brain wallet, where you generate the keys from your own passphrase, there is a menu for that.
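What a generator like bitaddress.org ultimately prints is 32 bytes of strong randomness encoded in Wallet Import Format (WIF). A minimal Python sketch of that encoding (version byte 0x80, then the key, then a 4-byte double-SHA256 checksum, all in Base58), shown only to demystify what is on the paper, not as a replacement for an audited generator:

```python
import hashlib
import secrets

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58(raw: bytes) -> str:
    # Leading-zero handling is omitted: the 0x80 prefix guarantees none here.
    num = int.from_bytes(raw, "big")
    out = ""
    while num:
        num, rem = divmod(num, 58)
        out = ALPHABET[rem] + out
    return out

def private_key_to_wif(key: bytes) -> str:
    """Mainnet uncompressed WIF: 0x80 prefix + key + 4-byte checksum."""
    payload = b"\x80" + key
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return base58(payload + checksum)

key = secrets.token_bytes(32)       # the entire secret of a paper wallet
print(private_key_to_wif(key))      # starts with '5' for this format
```

The checksum means a mistyped WIF is rejected when imported rather than silently producing the wrong wallet, the same safety idea as address checksums.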

Hardware Wallet

https://file.army/i/B4FlCQ1
Let us be realistic: there will be only a few times in a lifetime that you take out your paper wallet, and the more often you do, the riskier it is. The personal risk of a paper wallet is its fragility, where it can be damaged or lost. For example, getting ripped in an accident or just getting a bit wet may damage the paper. Once it is lost, your coins are inaccessible, possibly forever. So, you may rely on electronic storage media to reduce this risk. You can use a hard drive, USB drive, SD card, diskette, or even a whole computer device to store your keys, but the important thing is to keep them clean and offline.
https://file.army/i/B4Flpb7
Though you can always store your keys, or just install wallets, on a USB (universal serial bus) drive, today there are smarter USB devices specifically for cryptocurrency storage called hardware wallets. One of the first commercial hardware wallets is the Ledger Nano, but you can get a cheaper one, as cheap as $50, called Trezor, which I decided to have as my first hardware wallet. While a paper wallet is very long-term storage, a hardware wallet is still practical enough to be used for frequent transactions, but I still recommend using hot wallets for high-frequency trading.
https://file.army/i/B4FlFBs
After turning on your hardware wallet, you can import an existing wallet using your seed phrase. If this is your first hardware wallet, I recommend making a new one and dedicating it as another place to safely hold your coins. Back up and secure the keys and set a lock PIN. Then connect the hardware wallet to the computer.
https://file.army/i/B4Fldll
For example, using Trezor, install the bridge driver if necessary. Currently, you can manage your wallet by opening a browser and visiting https://wallet.trezor.io/. I am not sure whether you can save the page and open it offline, but they provide a manual at https://wiki.trezor.io/User_manual if you want to install your own web server to access your wallet online or locally. Trezor also plans to integrate with other applications, so do check; for example, it is available for integration in Exodus and Binance DEX.

Getting Your First Coin

Now that you have a medium to receive your coins, whether a custodial wallet, non-custodial wallet, hot wallet, cold wallet, or several of them, you are ready to fill them with coins. There are only three ways to get coins: transacting with someone, mining, or creating one by becoming a developer who invents your own coin. In this section, new users are recommended to get coins directly from someone. If you want to get them using your bank account, skip to the next chapter; if you want to mine, skip to a later chapter; and if you want to become a developer, skip to the next book. The last message in this section is to get yourself some cryptocurrency in order to participate in the ecosystem, because without it, your options are limited. Start with something small, or more accurately, an amount you are willing to lose or comfortable with, because if you go in big out of greed, expecting to get rich quick, your mentality may not be able to handle it. The market is very volatile: even if the price skyrockets in the future, it may first drop by more than half.

Coin ATMs

https://file.army/i/B4Flfkj
Getting coins from coin automated teller machines (ATMs) in popular areas is my first recommendation, because there is no way these machines escape the eyes of local regulators, so you can almost guarantee their safety and security. Go to a search engine and type, for example, "crypto atm" or "bitcoin atm". I found a website called https://coinatmradar.com/ where you can check whether there are crypto ATMs near you. Bring some cash (I am not sure whether you can use your bank card or not) and bring your public or receiving coin address. If this is your first time in cryptocurrency, do not forget that there are more coins than Bitcoin, and make sure you use the correct public address.

Buy From Someone You Trust

https://file.army/i/B4FlwPk
The next best option for starters to get cryptocurrency is to buy from people you trust. People in your circle should be your primary option. Ask your family and friends about cryptocurrency and ask them to guide you in buying some. With today's information technologies, you have more options than contacting them one by one; for example, you can post a status or broadcast a message on a social media platform. However, this is only recommended with people you trust, because as a newcomer, anyone can easily scam you, for example by selling you fake Bitcoins.
https://file.army/i/B4FlWZv
After you are knowledgeable enough, for example being able to identify fake coins, not sending money to people you cannot meet, and knowing the correct price on the global market, then I am confident in recommending that you find other people. If you can find a cryptocurrency community near you, that is great. If not, https://localbitcoins.com/ can help lead you to people near you who have Bitcoin.

Non-Custodial Credit Card Service

https://file.army/i/B4FlkFH
If you have a credit card, there are convenient online non-custodial services you can use, such as Simplex. All you need to do is give them your wallet's public or receiving address. However, not all credit cards work, for some reason, and there are many people who do not have a credit card.


Author

Fajar Purnama

Note

  • This is a thesis submitted to the Graduate School of Science and Technology, Computer Science and Electrical Engineering, Kumamoto University, Japan, in September 2017, in partial fulfillment of the requirements for the degree of Master of Engineering. It was not published, so the copyright remains with me, "Fajar Purnama", the main author, and I have the authority to repost it anywhere, claiming full responsibility detached from Kumamoto University. Therefore, I hereby declare to license it as customized CC-BY-SA, where you are also allowed to sell my content, with the condition that you must mention that the free and open version is available here. In summary, the mention must contain the keywords "free" and "open" and the location, such as a link to this content.
  • The presentation is available at Slide Share.
  • The source code is available at Github.

Below are the publications reused in this thesis that do not require copyright clearance:

Below are the publications reused in this thesis that require copyright clearance:

https://file.army/i/B4GsgSq
Incremental Synchronization Implementation on Survey using Hand Carry Server Raspberry Pi [9]
https://file.army/i/B4GsNPZ
Rsync and Rdiff implementation on Moodle's backup and restore feature for course synchronization over the network [12]

Abstract

The continuous advance of electronics and information and communication technology (ICT) has greatly influenced every aspect of life; this thesis discusses the education aspect. Electronics and ICT have been incorporated into the learning and teaching process, giving birth to electronic learning (e-learning). Within it is a well-known term, the online course, whose essence is being able to deliver courses remotely with flexibility in place and time. However, a simple condition must be met in order to implement an online course: sufficient ICT infrastructure. Unfortunately, not all regions meet this condition, limiting the accessibility of online courses. Other than improving the ICT infrastructure, a distributed learning management system (LMS) was proposed as an alternative, but the next issue was maintenance, or synchronization, which in this case means keeping the learning contents up to date. Two problems are highlighted in this thesis: the inability to perform synchronization in regions with severe network connectivity, and duplicate data transfer during synchronization.

To overcome synchronization in regions with severe network connectivity, the solution is to utilize hand carry servers. Implementing hand carry servers in a distributed LMS grants mobility to the servers. The proposed concept was to have the hand carry server physically seek network connectivity to perform online synchronization, and afterwards return to its original location. The hand carry server proved to be portable due to its small size, light weight, and low power consumption, where a power bank is enough to supply it for a whole day, although it has resource limitations in terms of central processing unit and random access memory that limit its performance.

To overcome duplicate data transfer during synchronization, incremental synchronization was utilized instead of full synchronization. This thesis also introduces a new approach called dump and upload based synchronization, which overcomes the obstacles of different LMSs and LMS versions faced by dynamic content synchronization.
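The block-based differential idea behind rsync, used later in this thesis, can be sketched in a few lines: the slave hashes fixed-size blocks of its old archive into a signature, the master matches those hashes against the new archive and emits a delta of copy/literal instructions, and the slave replays the delta. This simplified Python illustration omits rsync's rolling weak checksum and network transport, and assumes MD5 block hashes with a toy 4-byte block size:

```python
import hashlib

BLOCK = 4  # toy block size; rsync uses hundreds of bytes

def signature(data: bytes) -> list:
    """Slave side: one MD5 checksum per fixed-size block of the old file."""
    return [hashlib.md5(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def delta(new: bytes, sig: list) -> list:
    """Master side: copy instructions for blocks the slave already has,
    literal bytes for everything else."""
    index = {h: i for i, h in enumerate(sig)}
    ops, i = [], 0
    while i < len(new):
        chunk = new[i:i + BLOCK]
        h = hashlib.md5(chunk).hexdigest()
        if len(chunk) == BLOCK and h in index:
            ops.append(("copy", index[h]))   # block the slave already holds
            i += BLOCK
        else:
            ops.append(("data", new[i:i + 1]))  # literal byte to transmit
            i += 1
    return ops

def patch(old: bytes, ops: list) -> bytes:
    """Slave side: rebuild the new file from local blocks plus literals."""
    out = b""
    for op, v in ops:
        out += old[v * BLOCK:(v + 1) * BLOCK] if op == "copy" else v
    return out
```

Only the signature and the literal bytes cross the network, which is exactly the full-versus-incremental transmission saving evaluated in the experiments below.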

Table of Contents

List of Figures

List of Tables

  1. Introduction
    1. Background
    2. Problem
    3. Hypothesis
    4. Significance
    5. Objective
    6. Contribution
    7. Limitation
    8. Structure of the thesis
  2. Portable Distributed LMS
    1. Distributed Systems
      1. Partitioned System
      2. Replicated System
    2. Distributed Learning Management System
    3. Hand Carry Server in Distributed LMS
      1. Portability of Hand Carry Server
      2. Synchronization in Severe Network Connection
    4. Limitation of Hand Carry Server
      1. Resource
      2. Stress Testing
  3. Distributed LMS Synchronization
    1. Learning Content Sharing
    2. Full Synchronization versus Incremental Synchronization
      1. Full Synchronization
      2. Incremental Synchronization
      3. Dynamic Content Synchronization on Moodle
    3. Dump and Upload Based Synchronization
      1. Export and Import Feature
      2. Rsync, a Block-Based Remote Differential Algorithm
      3. Experiment Result and Evaluation
      4. Advantage of Dump and Upload Based Synchronization
    4. Conclusion and Future Work
      1. Conclusion
      2. Future Work

Acknowledgement

References

List of Figures

  1. In Chapter 1:
    1. Illustration of e-learning showing many electronic devices to be used (images from openclipart [1]).
    2. Illustration of the difference between a conventional course and an online course. While a conventional course is restricted by place and time, an online course can be anywhere and anytime (images from openclipart [1]).
  2. In Chapter 2:
    1. Illustration of the main benefit of a distributed system using the ICT penetration map of Indonesia in 2012, where greener regions showed good network connectivity and redder regions showed the opposite. (a) People in the redder regions will have difficulty accessing the central server. (b) On the other hand, people will have no difficulty accessing it if there are servers in their local regions.
    2. Illustration of using a hand carry computer device to gather information from other users, input from their own computer devices [10].
    3. Time consumption of the survey process from preparation and responding to post survey [10]. (a) For the paper-based method, preparation consists of question typing and question printing; responding consists of question distribution, question answering, and response collection; and post survey consists of response insertion. (b) For the hand carry server method, preparation consists of question typing with web delays, and responding consists of server connection and question answering with web delays; the advantage of this method is that no post survey is needed, as the responses are automatically inserted.
    4. Data in the form of bar graphs and pie charts shown the instant the hand carry server received the responses [10]. Only 4 of 30 item results are shown here since it is too much to show them all.
    5. Illustration of moving hand carry servers, which have to move to a location with network connectivity to synchronize with the main server, and return to their original location after finishing [9].
    6. Implementation illustration of hand carry servers in a distributed LMS in Indonesia. (a) Servers in the redder areas have difficulty with their network connectivity. (b) Replacing those servers with hand carry servers renders them physically mobile and able to search for network connectivity.
    7. Resource usage during a survey attempted by 30 users, showing mostly over 80% CPU usage and around 700 MB of RAM usage [10].
    8. Stress testing illustration using the Funkload software application, which generates up to 100 virtual users to stress the hand carry server (images from openclipart [1]).
    9. Stress testing showing increasing response time with an increasing number of virtual users and an increasing number of questionnaire items [10]: (a) average response time and (b) maximum response time.
  3. In Chapter 3:
    1. Illustration of full synchronization of learning contents in courses. Initial stage is learning content sharing where 100 mega bytes (MB) of course is shared. Next stage is update where there is 800MB of new data but whole 900MB is transfered which 100MB is aduplicate data. On next update there is 100MB of new data but whole 1GB is transfered which 900MB is duplicate data.
    2. Incremental synchronization different from Figure 3.1 where the duplicate data are filtered.
    3. Dynamic content synchronization model for Moodle [11]. The course packer converts both Moodle tables into synchronization tables. Then the synchronizer checks for inconsistency between the two tables which in the end applies the difference between both synchronization table to the slaves synchronization table. Finally the synchronization table is reconverted into Moodle table and that is how it is synchronized.
    4. The dump and upload based synchronization model. Both servers’ LMS will dump/export the desired learning contents (in this case packed into a course) into archives/files. The synchronizer will perform differential synchronization between the two archives. After synchronization the archives will be imported/uploaded into the servers’ LMS, updating the learning contents.
    5. Screenshot of Moodle’s export feature, (a) showed options like include accounts, and (b) showed learning contents to choose to export.
    6. First step is to generate a signature of the archive on the slave and send it to the master. The signature is used on the master's archive to generate a delta/patch, which can be called the difference, and have it sent to the slave. The slave will apply that delta/patch on its archive and produce an archive identical to the one on the master.
    7. Assume two archives where the outdated archive on the slave has only the second topic, and the latest archive on the master has all three topics. Here, for example, the outdated archive is divided into three blocks, and three sets of checksums are obtained and bundled into a signature. The signature is then sent to the master.
    8. Illustration of identifying the difference. (a) The three sets of checksums are compared in rolling fashion with blocks on the new archive. Blocks identical to the first and second sets of checksums are found and their locations are recorded, while no matching block is found for the third set of checksums, which will be marked for deletion. (b) The delta is generated on the master containing instructions to rearrange identical blocks, delete unfound blocks, and append new blocks, and will be sent to and applied on the slave.
    9. After the delta/patch is applied, slave will have identical archive to master.
    10. Implementation of some download manager techniques into rsync algorithm based synchronization. The delta is split into pieces which are retrieved by the client. The integrity of the pieces is checked using a checksum, here MD5, and inconsistent pieces are redownloaded. In the end the pieces are merged. This can also be implemented on the uplink side when sending the signature.
    11. Test result showing the relationship between block size, signature, and delta. When the block size increases the signature size decreases, but the delta does the opposite and increases. The full file is the size of the file to be downloaded without the differential method, in other words using full synchronization. The transmission cost when using incremental synchronization is the sum of signature and delta, which in this case is optimal at a block size of 512 bytes.
    12. Network traffic generated in the four scenarios of the experiment. Full synchronization generates the most network traffic, shown in blue bars. The orange and yellow bars are the network traffic of incremental synchronization depending on the size of the contents to be updated, which is lower than that of full synchronization. The green bars show incremental synchronization runs when there is no update, and the results are very low and tolerable.

List of Tables

  1. In Chapter 1:
  2. In Chapter 2:
    1. List of known LMSs categorized as open source, cloud, or proprietary.
    2. Specification of the hand carry computer Raspberry Pi 2 Model B.
  3. In Chapter 3:
    1. Size of course contents of the same course on different LMSs, showing sizes when it contains one, two, and three topics.
    2. Detailed experiment result of Figure 3.12 showing the size of signature and delta during incremental synchronization scenarios on each LMS.
    3. Experiment result of delta size compared to ideal size; the percentage of duplicate data eliminated was derived from these data.

1 Introduction

1.1 Background

Electronics and Information and Communication Technology (ICT) have made many tasks more convenient, including delivering education. Many have incorporated electronics into their learning and teaching process. A few examples are teachers using laptops and projectors to present their materials, students browsing the Internet to search for information, and both using email, chat, or social networking services to communicate. These practices are collectively called electronic learning (e-learning), which is illustrated in Figure 1.1.

https://file.army/i/B4GsqFF
Figure 1.1 Illustration of e-learning showing many electronic devices to be used (images from openclipart [1]).

This thesis, though, will not discuss e-learning broadly, but a category within e-learning called the online course. It uses electronic ICT devices so that information exchange can be done remotely. Information can be delivered as electrical signals at high speed over the network, preferably the Internet, with computer devices as end devices or as transmitters and receivers. Simply put, computer devices connected to the Internet are all that is needed to participate in an online course from anywhere at anytime, as illustrated in Figure 1.2.

https://file.army/i/B4GstNe
Figure 1.2 Illustration of the difference between conventional course and online course. While conventional course is restricted by place and time, online course can be anywhere and anytime (images from openclipart [1]).

The online course is now being highlighted by many parties who see it as one solution to the uneven distribution of education. Plainly, not everyone has access to good quality education, and some have no access at all; with online courses, people can receive education without going to school. Knowing this, our peers tried to implement online courses in their universities, one in Indonesia [2] and the other in Myanmar [3]. Another peer already has a well built online course in Mongolia and is now moving to massive open online courses (MOOC) [4]. Unlike private online courses only for students of a university, a MOOC is open to anyone indiscriminately. In the United States (US), MOOCs are also being used to scout for potential students. For example, the Massachusetts Institute of Technology (MIT) found a genius Mongolian high school student who perfectly aced its Circuits and Electronics MOOC, then took him as a freshman [5]. In summary, many people see a bright future in utilizing online courses in education.

With all the benefits of online courses, there are still problems preventing many people from enjoying them. The problem is the lack of accessibility of online courses due to insufficient ICT infrastructure. In other words, there are people who have network connectivity issues, especially in developing countries. In a random survey by Kusumo et al. [5] of students in Indonesia, 60% agreed that Internet connection is still problematic. The e-readiness survey by Monmon et al. [3] at Yangon Technological University and Mandalay Technological University in Myanmar showed lower Likert scale scores for the students' and teachers' perception of the ICT network compared to other items. Today the world's Internet penetration is still around 50%, indicating that only half of the world's population can access online courses [7]. Even for those who have access, the access quality may still be questionable, which can lead to dissatisfaction with online courses.

The obvious solution to the accessibility issue is to improve the ICT infrastructure; however, this takes a long time. Therefore another method was implemented: a distributed system rather than a centralized system. The concept is to have people access the service in their local area, which is closer, rather than in the central area, which is farther away. In some references this is stated as the third generation of content management systems (CMS) [8], though this work is more about the learning contents of a learning management system (LMS) than the general contents of a CMS.

1.2 Problem

With distributed LMS as the solution to the lack of accessibility of online courses, the next problem, which this thesis discusses, arises. That problem is synchronization, which is keeping the learning contents up to date. This can also be regarded as maintenance of the learning contents. Specifically, two problems are highlighted in this thesis, as follows:

  1. The lack of network connectivity for synchronization. Usually synchronization is done online, where the servers synchronize with one another to keep the learning contents at their latest version. In that case, synchronization is not possible when there is no network connectivity.
  2. Duplicate data transfer during synchronization. By default full synchronization is used, where the learning contents are usually bundled into courses. Commonly, when the contents of a course are revised on the LMS, the whole contents of the course are distributed to the other servers, including previously distributed contents (duplicate data). In this case there will be much redundant data, which adds more burden to the network.

1.3 Hypothesis

This thesis provides two main solutions for the two problems:

  1. For the first problem of no network connectivity, the solution is to add a portability function to the distributed LMS: enabling the servers to move to other locations where there is network connectivity to synchronize, and to return to their original location after synchronization finishes.
  2. For the second problem of duplicate data, the solution is to utilize incremental synchronization through a continuous differential synchronization technique. The new contents are identified before synchronization and only the new contents are distributed, leaving out the redundant data.

1.4 Significance

Detailed significance is discussed in further sections, but in general it can be stated as follows:

  1. The possibility of flexible synchronization in regions with severe network connectivity by mobilizing the servers of the distributed LMS. It can also be pictured as widening the network coverage.
  2. Lower network cost can be achieved from incremental synchronization.

1.5 Objective

The objective of this research is to enable online synchronization of a distributed LMS in regions with almost no network connectivity and to reduce redundant data transfer during synchronization.

1.6 Contribution

  1. Introduced the novel concept of integrating hand carry servers into a distributed LMS, which makes it mobile or portable and thus able to perform synchronization in regions with severe network connectivity [9]. This thesis also demonstrated the hand carry server's portability through a survey simulation and, on the other hand, showed its limitations through stress testing [10].
  2. Though the novelty of incremental synchronization in distributed LMS was already claimed [11], this thesis shows a different approach called dump and upload based synchronization [12]. The advantage is that its single software application is compatible with most LMSs and benefits from the features of each LMS, for example privacy and security features which automatically make the synchronization private and secure, and, on Moodle, the possibility of partial synchronization due to the micronization of course contents into blocks. Another advantage is that this approach supports bidirectional synchronization.

1.7 Limitation

Each method may have limitations, which are discussed in detail in their respective sections, but the general limitations of this research are mentioned here:

  1. The system was only tested in the laboratory and has not yet been implemented in real running online courses. The experiments were done on the author's virtual machines, the laboratory's local area network (LAN), and the author's free public cloud accounts.
  2. Only one hand carry server was used in the actual experiment; the discussed expansion to more than one is still a concept derived from the experiment.
  3. This thesis' dump and upload based incremental synchronization is novel in its concept but not in its software application, since it only makes use of existing software applications: the export and import features of the LMS to dump the learning contents, and the rdiff application, based on rsync, to identify the difference between dumps.
  4. The course experimented on is the author's self created course, which was never delivered; in short, it is not an actual running course.

1.8 Structure of the thesis

Beyond this section the thesis contains three more chapters:

  1. Chapter 2 discusses portable distributed LMS, giving in order a brief introduction to distributed LMS, the author's work showing the convenience of the hand carry server [10], the concept of the hand carry server in distributed LMS [9], and lastly the hand carry server's limitations.
  2. Chapter 3 discusses incremental data synchronization, covering in order the story of sharing learning contents, the distinction between full synchronization and differential or incremental synchronization, a discussion of the previous work on dynamic content synchronization [11] versus the author's work on dump and upload based synchronization [12], and finally experiments and results showing the percentage of duplicate data eliminated by incremental synchronization.
  3. Chapter 4 is the conclusion of this thesis and also discusses future work.

2 Portable Distributed LMS

2.1 Distributed Systems

2.1.1 Partitioned System

Distributed systems are a wide discussion with different implementations [8]. One implementation is the partitioned system. For example, an organization's network can have its servers separated, with the database, directory, domain name service (DNS), dynamic host configuration protocol (DHCP), file, web, and other servers each on a separate machine. They are integrated but independent: if one service (server) is damaged, the other services are not. A different example is data partitioning, where data are fragmented so that when retrieving data they have to be gathered and merged. This usually happens in collaboration, where people work on the same project but from different machines.

2.1.2 Replicated System

Another implementation is the replicated system, which is the one referred to and used in this thesis. The urgency for a replicated system can be due to bottleneck traffic, geographically severe network connectivity, or both. One of the most popular implementations is the search engine, like Google and Yahoo, which have different server locations assigned local domains, for example .co.jp for Japan, .co.id for Indonesia, etc. Not as well known as search engines are online multiplayer games. The servers of online multiplayer games can reside in many regions such as Asia, Europe, the United States, China, etc. There are games that show the population on each server, indicating whether it is full or not. Players can choose other servers when a server reaches its population limit or when players cannot actually reach the server in that region.

2.2 Distributed Learning Management System

One definition of an LMS is a system that manages learning and teaching, specifically for the online case. The current form of the LMS is a software application. It does not just deliver learning materials to students but computerizes online any activities that can happen in a class. Some activities are interactions, whether by chat applications or forums as on a social networking service (SNS); assignments, which are submitted electronically through the LMS by uploading files; and quizzes or examinations, which can be automatically or manually graded. Since it can be accessed from anywhere at anytime, and the computers used can perform tasks much faster and more automatically than humans, unique applications, data mining, and learning analytics become possible. In short, new features are being developed every day. Many LMSs exist today, listed in Table 2.1, whether open source (free to use and modify, with all the code open), available only on clouds or as software as a service (SAAS), which tends to be freeware/usage only, or proprietary, which tends to be business/commercial/paid. In the author's surroundings mostly Moodle is used.

Table 2.1 List of known LMSs categorized as open source, cloud, or proprietary.
Open Source aTutor, Canvas, Chamilo, Claroline, eFront, ILIAS, LAMS, LON-CAPA, Moodle, OLAT, OpenOLAT, Sakai, SWAD, Totara LMS, WeBWorK
SAAS/Cloud Cornerstone OnDemand Inc, Docebo LMS, Google Classroom, Grovo, Halogen Software, Informetica, Inquisiq R3, Kannu, Latitude Learning, Litmos, Talent LMS, Paradiso LMS, TOPYX, TrainCaster LMS, WizIQ LinkStreet
Proprietary Blackboard Learning System, CERTPOINT Systems Inc, Desire2Learn, eCollege, Edmodo, Engrade, WizIQ, GlobalScholar, Glow, HotChalk, Informetica, ITWorx CLG, JoomlaLMS, Kannu, Latitude Learning LLC, Uzity, SAP, Schoology, SSLearn, Spongelab, Skillsoft, EduNxt, SuccessFactors, SumTotal Systems, Taleo, Teachable, Vitalect

The term distributed LMS means that the replicated servers contain an LMS. Each server is meant to serve the online course. The implementation can be full replication, where not only learning contents but everything else, including activities, assessments, and interactions, is synchronized. This means students and teachers can freely use any server, preferably the one with the best network connectivity. The other implementation is partial replication, where only non-private data are synchronized, usually only the learning contents. This can happen when there are jurisdictions in which each region is to be handled locally. In other words, contents are provided, but the schools and universities remain the owners of their own servers and assert local authority. Either way, the distributed system is the solution for bottleneck and connectivity issues. As illustrated in Figure 2.1 for Indonesia, it is better to build and spread more servers than to have a centralized server in the capital city.

https://file.army/i/B4GsP85 https://file.army/i/B4GsUQA
Figure 2.1 Illustration of the main benefit of a distributed system using the ICT penetration map of Indonesia in 2012, where greener regions show good network connectivity and redder regions show the opposite. (a) People in redder regions will have difficulty accessing the central server. (b) On the other hand, people will have no difficulty accessing servers in their local regions.

2.3 Hand Carry Server in Distributed LMS

After the establishment of the distributed LMS, the contents need to be maintained, or kept up to date, through synchronization. However, the problem is the lack of network connectivity between servers, usually found in remote areas such as schools in villages. It may be easy to build a LAN but difficult to build connections to other servers, or simply an Internet connection, in distant places. In a short time it is only possible to build a very limited (very low speed) connection, over which retrieval of large contents may seem to take forever. The metaphor is building a server in a jungle, on a remote island, or in a desert: very isolated. The default solution is offline synchronization; the author's solution is server mobilization [9].

2.3.1 Portability of Hand Carry Server

Before discussing the synchronization, this section introduces hand carry servers. In this thesis it is called a hand carry server because the physical hardware is a computer the size of a regular human hand that has been configured into a server. It may also be called a mini, pocket size, or portable computer; the example used in this thesis is the Raspberry Pi 2, with the specification in Table 2.2.

Table 2.2 Specification of the hand carry computer Raspberry Pi 2 Model B.
Specification
A 900MHz quad-core ARM Cortex-A7 CPU
1 Giga Byte (GB) Random Access Memory (RAM)
4 Universal Serial Bus (USB) ports
40 General Purpose Input Output (GPIO) pins
Ethernet Port
Camera Serial Interface (CSI)
Display Serial Interface (DSI)
Micro Secure Digital (SD) card slot
VideoCore IV 3D graphics core
Size of 85.60 mm × 56.5 mm (3.370 in × 2.224 in), not including protruding connectors
Weight of 45g

The portability was demonstrated in one of the author's previous works [10]. It is less related to distributed systems, but it shows an application of the hand carry server to manual labor: a simulation comparing a paper based survey method with a hand carry server survey method. The motivation was that in developing countries there is a lack of Internet connection to perform online surveys, yet most people own computing devices [3] [7] [13]. Instead of reverting to the paper based method, the participants' personal digital assistants (PDAs) can be utilized by connecting them to the hand carry server to perform a semi-online survey, illustrated in Figure 2.2.

https://file.army/i/B4Gsib4
Figure 2.2 Illustration of using a hand carry computer device to gather information from other users, input from their own computer devices [10].

For the simulation, a MOOC readiness survey [4] consisting of 30 questionnaire items was simulated with 30 participants by a surveyor. The whole survey consists of three stages: preparation, responding, and post survey. In the preparation stage, for the paper based method the surveyor creates the questionnaire items in word processing software and then prints them, while for the hand carry server method the surveyor creates the questionnaire in a web based survey application called Limesurvey CMS. In the responding stage, for the paper based method the surveyor hands out paper to each participant and collects it when they finish responding, while for the hand carry server method the surveyor tells the participants to connect their PDAs to the hand carry server, informs them of the URL of the local survey site, then waits until the participants submit their results to the hand carry server. Though the results in Figure 2.3 showed no difference in time consumption for the preparation and responding stages, the paper based method tends to involve more labor, such as printing the questionnaires (the time taken multiplies greatly with old printers) and carrying heavy papers if there are a lot of participants. On the other hand, resources are the main issue for the hand carry server, which will be discussed in the Limitation of Hand Carry Server section.

https://file.army/i/B4GsyBn https://file.army/i/B4GJ9l1
Figure 2.3 Time consumption of the survey process from preparation, responding, to post survey [10]. (a) For the paper based method the preparation consists of question typing and question printing, responding consists of question distribution, question answering, and response collection, and post survey consists of response insertion. (b) For the hand carry server method the preparation consists of question typing with web delays, responding consists of server connection and question answering with web delay, and the advantage of this method is that no post survey is needed since the responses are already automatically inserted.

However, the advantage was shown in the post survey stage, where usually the surveyors have to input the responses into the database and also handle human errors through verification such as double checking, which seems to be the most stressful and tiring process of the paper based method. It is different for the hand carry server method, where the responses are processed automatically; there is literally no post survey stage. In fact, results/statistics are instantly visible, which no manual method can match. The participants can see the current statistics the moment they submit their responses, as exemplified in Figure 2.4.

https://file.army/i/B4GJBm7
Figure 2.4 Data in the form of bar graphs and pie charts were shown the instant the hand carry server received the responses [10]. Only 4 of the 30 item results are shown here since it is too much to show all.

The author's work mostly discussed the convenience of computerization, but the important part is the mobility or portability [10]. Back in Figure 2.2, the hand carry server can be carried anywhere (a walking/moving server); it only needs a power supply of direct current (DC) at 5 V (volts) potential difference and 2 A (amperes) electric current, for which a hand carry power bank is usually enough. In the simulation the current delivery was also measured: 0.6 Ah (ampere hours) in 39 minutes (the whole duration of the survey, see Figure 2.3), meaning that with a power bank specification of 20000 mAh it will last around 20 hours. In short, the hand carry server has a low power cost and can last long while mobile.
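The battery estimate above can be checked with a short calculation (a sketch; it assumes the power bank rating is 20000 mAh, the customary unit for power banks, and ignores voltage conversion losses):

```python
# Estimate how long a power bank can sustain the hand carry server,
# based on the measured charge drawn during the 39-minute survey.

measured_charge_ah = 0.6       # charge drawn during the survey (Ah)
survey_duration_h = 39 / 60    # survey duration (hours)

# Average current drawn by the server during the survey.
avg_current_a = measured_charge_ah / survey_duration_h   # about 0.92 A

# A 20000 mAh power bank holds 20 Ah of charge.
powerbank_ah = 20000 / 1000

runtime_h = powerbank_ah / avg_current_a
print(f"average current: {avg_current_a:.2f} A")
print(f"estimated runtime: {runtime_h:.1f} hours")   # roughly 20 hours
```

The result of about 21.7 hours is consistent with the "around 20 hours" figure cited from the measurement.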

2.3.2 Synchronization in Severe Network Connection

Currently, synchronization has to be taken offline when there is no network connectivity, whether it is full or incremental (discussed in the next chapter). An administrator will go to a network connected location, or directly to the updated server, to retrieve the contents and store them on a storage medium such as a compact disc (CD) or flash drive, then travel back to the outdated server, insert the storage medium, and deliver the contents. There is a work by Ijtihadie et al. [14] on differential updates sent through email which then differentially update the contents. It should be possible to put the differentials onto a storage medium which is then inserted into the outdated server to update the contents.

https://file.army/i/B4GJ7Ps
Figure 2.5 Illustration of moving hand carry servers, where they have to move to a location with network connectivity to synchronize with the main server and return to their original location after finishing [9].

Another way is to move the servers to an area with connectivity, have them update, and then return them to their original location [9]. This was inspired by Ijtihadie et al. [15], where students download a quiz onto their mobile devices, answer it offline at home, and later find an Internet connection to synchronize (automatically uploading their answers). This concept was applied to this thesis' work, where the process happens to the hand carry server instead of the mobile device. It is illustrated in Figure 2.5, currently with people carrying the servers. An example implementation is shown in Figure 2.6. There are regions in Indonesia which do not have good network connectivity, making it difficult to synchronize with other servers. If those servers are replaced with hand carry servers, they can physically move to find network connectivity (both wired and wireless connections are supported) to synchronize, and in the end return to their original location.

https://file.army/i/B4GJYZl https://file.army/i/B4GJzFj
Figure 2.6 Illustration of implementing hand carry servers in a distributed LMS in Indonesia. (a) Servers in redder areas have difficulty with their network connectivity. (b) Replacing those servers with hand carry servers renders them physically mobile and able to search for network connectivity.

Within the distributed LMS, the servers can either be replaced with hand carry servers, or left mounted with hand carry servers as an addition or support, meaning the hand carry servers will travel from server to server. It is a temporary implementation for when no network infrastructure is built, since it is fast and simple to install, or it can serve to cover network coverage holes, where the hand carry server moves around the uncovered areas.

2.4 Limitation of Hand Carry Server

2.4.1 Resource

With its compressed size and light weight, the hand carry server has resource limitations. The resources responsible for servicing are mainly the central processing unit (CPU) and random access memory (RAM) (the detailed specification can be seen back in Table 2.2). As shown in Figure 2.7, the CPU and RAM are already exhausted when 30 participants attempt the survey [10]. These measurement results alone may not mean much, but they become meaningful when stress testing is conducted, as in the next subsection.

https://file.army/i/B4GJ4Nk https://file.army/i/B4GJKKv
Figure 2.7 Resource usage during the survey attempted by 30 users, showing mostly over 80% CPU usage and around 700MB of RAM usage [10].

2.4.2 Stress Testing

Experienced users may understand completely from the resource measurement results alone, but others will have to feel, rub, and take a few trials to see how far the hand carry server is actually capable. For that reason, stress testing was proposed and conducted. Though it was tested for the survey purpose [10], the method is applicable to other applications. For the stress testing, a web stress testing software application called Funkload was used. Different numbers of virtual users, incremented from 10 up to 100, were generated to attempt the survey on the hand carry server simultaneously, as illustrated in Figure 2.8. This time only response time was measured.

https://file.army/i/B4GJRQH
Figure 2.8 Stress testing illustration using Funkload software application that generates up to 100 virtual users to stress the hand carry server (images from openclipart [1]).

Response time can be referred to as service time, in this case how long users take to load questionnaire items and to submit responses. The service time can also be called queuing time, where some users take a shorter time and some take longer; Figure 2.9 shows the average response time and the maximum response time (the user in the last place of the queue). It shows that the response time increases with the number of users and also when the questionnaire content size increases, because that affects the number of questionnaire items to be retrieved and how many responses have to be submitted. From these results, the surveyor can decide the target average response time and the tolerable maximum response time, and then determine the number of simultaneous users and questionnaire items. The results also showed that the hand carry server reaches its limit above 85 concurrent users with 30 questionnaire items, at which point the service stops working and must be restarted.

https://file.army/i/B4GJebf https://file.army/i/B4GJI7I
Figure 2.9 Stress testing showing increasing response time to increasing number of virtual users and increasing number of questionnaire items [10], (a) average response time while (b) maximum response time.
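The measurement above can be sketched as a minimal load generator (not the actual Funkload configuration; the stand-in attempt_survey function and its fixed delay are hypothetical) that spawns concurrent virtual users and reports average and maximum response times:

```python
# Minimal stress-test sketch in the spirit of the Funkload experiment:
# spawn N concurrent "virtual users", time each request, and report the
# average and maximum response time. A real test would perform HTTP
# requests against the survey URL instead of the stand-in below.
import time
from concurrent.futures import ThreadPoolExecutor

def attempt_survey():
    """Stand-in for loading the questionnaire and submitting a response."""
    time.sleep(0.05)  # simulate server processing / network delay

def stress_test(virtual_users):
    def timed_attempt(_):
        start = time.perf_counter()
        attempt_survey()
        return time.perf_counter() - start

    # Each virtual user runs in its own thread, all at the same time.
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        times = list(pool.map(timed_attempt, range(virtual_users)))
    return sum(times) / len(times), max(times)

for n in (10, 50, 100):
    avg, worst = stress_test(n)
    print(f"{n:3d} users: avg {avg*1000:.0f} ms, max {worst*1000:.0f} ms")
```

Against a real server the delay grows with load, which is exactly the trend Figure 2.9 reports.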

3 Distributed LMS Synchronization

3.1 Learning Content Sharing

Before going to the main discussion of synchronization, it is better to discuss learning content sharing. Sharing learning contents became popular ever since MOOCs were introduced. A course "Moodle on MOOC", conducted periodically, teaches students how to use Moodle and advises them to share their finished courses [16]. Making well designed and well written learning contents for an online course from scratch may consume a lot of time; learning content sharing helps other instructors quickly develop their own. Some specialized courses may only be writable by experts. Learning content sharing reduces the burden on teachers of creating learning contents for online courses, and the more online courses exist, the better the chance for students from all over the world to access quality education.

Distributed LMS is also another form of learning content sharing, where the learning contents are shared with servers in other regions. The typical way of learning content sharing is dump, copy, then upload. Most LMSs have a feature to export their course contents into an archive and allow importing the contents into another server running the same LMS. The export and import technique varies between systems, but the concept is to synchronize the directory structure and database. Demand for this feature is so high that it is still being improved; for example, the ability to export a user defined part of the contents is being developed. Other LMSs that currently lack this feature will have it developed, as stated on their developer forums.

3.2 Full Synchronization versus Incremental Synchronization

3.2.1 Full Synchronization

Synchronization can be defined as similar movements between two or more systems which are temporally aligned, though in this case it is the action of causing a set of data or files to remain identical in more than one location. The data or files are learning contents and private data, although private data are usually excluded. The term full synchronization as defined in this thesis is the distribution of the whole data, consisting of new data and existing data. Synchronization occurs when new data are present, to update the data of other servers. As illustrated in Figure 3.1, full synchronization includes existing or duplicate data, which is redundant and only adds unnecessary burden to the network. However, full synchronization is more reliable because the full data are available on every transfer.

https://file.army/i/B4GJSlV
Figure 3.1 Illustration of full synchronization of learning contents in courses. The initial stage is learning content sharing, where a 100 megabyte (MB) course is shared. The next stage is an update where there are 800MB of new data, but the whole 900MB is transferred, of which 100MB is duplicate data. On the next update there are 100MB of new data, but the whole 1GB is transferred, of which 900MB is duplicate data.

3.2.2 Incremental Synchronization

Ideally the duplicate data should be filtered out and not distributed, for the highest efficiency. The conventional way is the recording approach, where the changes made by the authors of the course are recorded. The changes can only be additions or deletions at certain locations. These actions are recorded and sent to the other servers, which execute them to achieve identical learning contents; this resembles a push mechanism where the main server forces updates onto the other servers. Accurate changes can be obtained, but the approach cannot recover from errors because the process is unrepeatable. Another issue is the restriction that no modification may take place on the learning contents of the other servers, meaning the slightest change, corruption, or mutation can render the servers unsynchronizable.

Instead of the recording approach, the calculating approach is more popular due to its repeatable process and fewer restrictions. The approach is to calculate the difference between the new and the outdated learning contents. The process can therefore be repeated, and changes, corruption, or mutation on either side's learning contents do not prevent synchronization. One of the origins of the calculating approach is the file differential algorithm developed at Bell Laboratories [17], known today as the diff utility in Unix. The detailed algorithm may seem complicated, but in summary it extracts the longest common subsequence of lines between the two files (in effect finding their similarity); the remaining lines of the old file are then deleted, while the remaining lines of the new file are inserted around the common subsequence at the correct locations, resulting in an update of the old file. For large files, hashing is involved.
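The line-based diff idea can be sketched with Python's standard difflib module, which similarly finds matching subsequences between two files and expresses the rest as edit operations. This is a simplified illustration in the spirit of diff, not the exact Bell Labs algorithm:

```python
import difflib

old = ["topic one", "topic two", "topic three (old)"]
new = ["topic one", "topic two", "topic three (revised)", "topic four"]

# Find matching subsequences; everything else becomes delete/insert/replace ops.
ops = difflib.SequenceMatcher(a=old, b=new).get_opcodes()

def apply_ops(old, new, ops):
    """Rebuild the new file from the old file plus the recorded edits."""
    out = []
    for tag, i1, i2, j1, j2 in ops:
        if tag == "equal":
            out.extend(old[i1:i2])      # keep the common subsequence
        else:
            out.extend(new[j1:j2])      # 'replace'/'insert' take new lines
    return out

assert apply_ops(old, new, ops) == new
```

Because the edits are recomputed from the two files each time, the process is repeatable, which is exactly the property that distinguishes the calculating approach from the recording approach.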

Applying the file differential algorithm to synchronization makes it differential synchronization. Unlike full synchronization, differential synchronization distributes only the new data. Repeating differential synchronization makes it incremental synchronization, the repeated distribution of only the new data. The synchronization is incremental in the sense that only the updates are sent each time; put another way, the learning contents add up with every differential update. Ultimately, duplicate data are filtered out, reducing the unnecessary burden on the network, as illustrated in Figure 3.2.

https://file.army/i/B4GaJso
Figure 3.2 Incremental synchronization, different from Figure 3.1 in that the duplicate data are filtered out.
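The savings can be checked with the numbers in Figures 3.1 and 3.2: full synchronization retransmits the accumulated course on every update, while incremental synchronization transmits only the new data. A back-of-the-envelope sketch:

```python
updates_mb = [100, 800, 100]   # initial share, then two updates (Figures 3.1/3.2)

full_traffic = 0
course_size = 0
for new_data in updates_mb:
    course_size += new_data
    full_traffic += course_size        # full sync resends the whole course
incremental_traffic = sum(updates_mb)  # incremental sync sends only new data

assert full_traffic == 2000            # 100 + 900 + 1000 MB
assert incremental_traffic == 1000     # the 1000 MB of duplicates is never sent
```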

3.2.3 Dynamic Content Synchronization on Moodle

The idea of implementing differential synchronization on a distributed LMS was started by Usagawa et al. [18] and continued by Ijtihadie et al. [11] [19]. These works limit themselves to a distributed Moodle system because they focus solely on Moodle's structure: when writing the software, it is necessary to identify the database tables and directories that hold the learning contents. The incremental synchronization between two Moodle systems was described as dynamic content synchronization [11], where the learning contents are constantly updated. Dynamic content synchronization is unidirectional, or simplex in terms of the communication model: one Moodle system is fixed as the master, which distributes the updates, and the other as the slave, which receives them.

The file differential algorithm was applied to maintain consistency between the master's and the slave's database tables and directories. The database tables and directories are assigned hashes [11]. These hashes are exchanged between master and slave: identical hashes mean the contents should not be changed, while mismatched hashes mean the contents should be updated. Although Ijtihadie et al. [11] developed their own algorithm specifically for synchronizing learning contents between LMSs, it is not much different from existing remote differential file synchronization algorithms such as rsync [20].

The Moodle tables in the database are converted into synchronization tables, as in Figure 3.3, by means of hashing. Only contents related to the selected course are converted and sorted by the course packer. Privacy is highly regarded, so private data are filtered out. The purpose is to find inconsistencies in the database between master and slave. As stated in the previous paragraph, hashes are often used to test for inconsistency: if the hashes differ, the contents are inconsistent, and vice versa. When an inconsistency is found in a certain table, the master sends its table to the slave, replacing the slave's table so that both become consistent. In the end the synchronization tables are converted back into Moodle tables. In summary, dynamic content synchronization only takes place on the parts of the database and directories that have changed or are inconsistent.

https://file.army/i/B4GJZmp
Figure 3.3 Dynamic content synchronization model for Moodle [11]. The course packer converts both sides' Moodle tables into synchronization tables. The synchronizer then checks for inconsistencies between the two tables and applies the difference to the slave's synchronization table. Finally the synchronization table is converted back into Moodle tables, completing the synchronization.
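The hash-based consistency check can be sketched as follows. The table names and row contents here are made up for illustration, and MD5 stands in for whatever hash the course packer actually uses:

```python
import hashlib

def table_hash(rows):
    """One hash per synchronization table; any changed row changes the hash."""
    h = hashlib.md5()
    for row in rows:
        h.update(repr(row).encode())
    return h.hexdigest()

# Hypothetical course tables on each side; the forum differs.
master = {"quiz": [("q1", "2+2?")], "forum": [("t1", "hello")]}
slave  = {"quiz": [("q1", "2+2?")], "forum": [("t1", "hi")]}

# Only tables whose hashes mismatch are re-sent by the master.
stale = [t for t in master if table_hash(master[t]) != table_hash(slave.get(t, []))]
for t in stale:
    slave[t] = master[t]

assert stale == ["forum"]
assert slave == master
```

Identical hashes let the quiz table be skipped entirely, which is how synchronization is limited to the inconsistent parts of the database.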

3.3 Dump and Upload Based Synchronization

The dynamic content synchronization [11] software was written solely for Moodle, and at the time for Moodle version 1.9. Moodle later rose to version 2.0, with major changes to its database and directory structure. The software had to be changed to suit the new Moodle version [19], but the concept of synchronization remained the same. Moodle has continued to develop and is now at version 3.3, but sadly the dynamic content synchronization software was discontinued at Moodle version 2.0. The author originally tried to continue the software but found a better approach, named the dump and upload based synchronization model [12], shown in Figure 3.4. Unlike dynamic content synchronization, dump and upload based synchronization is bidirectional, though limited to a half duplex communication model. In other words, each server can play both master and slave, but only one role at a time. For example, on the first synchronization one server can act as the master while the others act as slaves, and on the second synchronization the master can switch to a slave and one of the slaves can switch to a master. In addition, the synchronization uses a pull mechanism, where the slave checks for and requests updates from the master. This is considered more flexible than the push mechanism, where the master forcefully updates the slaves.

https://file.army/i/B4GJuUK
Figure 3.4 The dump and upload based synchronization model. Both servers' LMSs dump/export the desired learning contents (in this case packed into a course) into archives/files. The synchronizer performs differential synchronization between the two archives. After synchronization the archives are imported/uploaded into the servers' LMSs, updating the learning contents.

3.3.1 Export and Import Feature

While dynamic content synchronization handles everything from scratch, dump and upload based synchronization utilizes the export and import feature that exists in most LMSs. This feature mainly exports and imports learning contents grouped into courses, which can also be called course contents. The export feature outputs the course contents' database tables and directories into a structured format. The import feature then reads that format and inserts the data into the correct database tables and directories. Formats differ from one LMS to another, but the method is essentially the same.

Other features include export and import of course lists, user accounts, and probably more that are neither known nor used in this thesis. One of the best export and import implementations is Moodle's, where the course contents can be split further, while other LMSs have to dump the whole course. This way people can choose to get only the contents they are interested in, which opens a path for partial synchronization, where only specific contents or parts of the course are synchronized. Another advantage is the option to include private data, exclude it, or include it anonymized; in other words, it supports privacy. In summary, the advantage of Moodle's export and import feature compared to other LMSs' is the ability to secure private data and to split course contents into smaller blocks, as shown in the screenshots in Figure 3.5. This thesis highly recommends that other LMSs' export and import features follow Moodle's footsteps.

https://file.army/i/B4GJ6ZD https://file.army/i/B4GJEdQ
Figure 3.5 Screenshots of Moodle's export feature: (a) shows options such as whether to include user accounts, and (b) shows the learning contents that can be chosen for export.

3.3.2 Rsync, a Block Based Remote Differential Algorithm

With the previous subsection having explained that course contents can be dumped using the export and import feature, the next step is performing remote differential synchronization between the two archives. The author chose not to develop a new algorithm but used an existing one called rsync [20]. The author also did not write a program to perform rsync, but used an existing program based on the rsync library (librsync). The author's contribution is making this program work over the hypertext transfer protocol (HTTP), i.e. on web browsers, since LMSs are usually web based (rsync is mostly used over secure shell (SSH)). There are three general steps in performing the rsync algorithm between two archives located on different servers, as in Figure 3.6, with details as follows:

https://file.army/i/B4GJOVY
Figure 3.6 The first step is to generate a signature of the archive on the slave and send it to the master. The signature is used on the master's archive to generate a delta/patch, which can be called the difference, and this is sent to the slave. The slave applies that delta/patch to its archive and produces an archive identical to the one on the master.
  1. The archive to be updated is divided into blocks, and for each block two types of hash or checksum are calculated: a weak rolling checksum, for example Adler-32, and a strong checksum, for example BLAKE2 or MD5. The checksums are bundled into a signature and sent to the other server. The user can determine the size of the blocks, which affects the accuracy of finding differences. Figure 3.7 illustrates this step.
    https://file.army/i/B4GJGKa
    Figure 3.7 Assume two archives, where the outdated archive on the slave has only the second topic, and the latest archive on the master has all three topics. Here, for example, the outdated archive is divided into three blocks, and three sets of checksums are obtained and bundled into a signature. The signature is then sent to the master.
  2. The signature is then used on the server holding the latest version of the archive. First, the weak checksum is checked in a rolling fashion over the blocks. Second, if a block's weak checksum is identical, the two strong checksums are compared to verify whether the block is really identical. For blocks with identical checksums, their locations are recorded, while the other blocks are regarded as new blocks that must be sent to the server with the outdated archive. For checksums in the signature with no matching blocks found in the latest archive, the blocks of the outdated archive that generated them are regarded as deleted. Based on all of this information, a delta/patch is generated containing instructions to alter the blocks of the outdated archive and the new blocks to be inserted. This step is illustrated in Figure 3.8.
    https://file.army/i/B4GJLXL https://file.army/i/B4GJhn9
    Figure 3.8 Illustration of identifying differences. (a) The three sets of checksums are compared in a rolling fashion with the blocks of the new archive. Blocks identical to the first and second sets of checksums are found and their locations recorded, while no matching block is found for the third set of checksums, which is marked for deletion. (b) The delta is generated on the master, containing instructions to rearrange identical blocks, delete unmatched blocks, and append new blocks; it is then sent to and applied on the slave.
  3. The delta/patch is sent to the server with the outdated archive, which applies it to its archive, constructing an archive identical to the latest version, as in Figure 3.9.
    https://file.army/i/B4GJM7o
    Figure 3.9 After the delta/patch is applied, the slave will have an archive identical to the master's.
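The three steps above can be condensed into a toy version of the algorithm. This sketch uses a 4-byte block, a simple additive sum standing in for the real Adler-32 rolling checksum (recomputed at every offset rather than rolled in O(1)), and MD5 as the strong checksum; real librsync deltas also batch literal bytes instead of emitting them one at a time:

```python
import hashlib

BLOCK = 4  # tiny block size for illustration only

def weak(block):
    # Stand-in for the Adler-32 rolling checksum (recomputed, not rolled)
    return sum(block) % 65521

def signature(data):
    """Step 1: per-block weak and strong checksums of the outdated archive."""
    sig = {}
    for i in range(0, len(data), BLOCK):
        b = data[i:i+BLOCK]
        sig.setdefault(weak(b), {})[hashlib.md5(b).hexdigest()] = i
    return sig

def delta(sig, new):
    """Step 2: slide over the latest archive, reusing matched blocks."""
    ops, i = [], 0
    while i < len(new):
        b = new[i:i+BLOCK]
        match = sig.get(weak(b), {}).get(hashlib.md5(b).hexdigest())
        if match is not None and len(b) == BLOCK:
            ops.append(("copy", match))       # slave already has this block
            i += BLOCK
        else:
            ops.append(("data", new[i:i+1]))  # literal byte to transmit
            i += 1
    return ops

def patch(old, ops):
    """Step 3: rebuild the latest archive on the slave."""
    out = b""
    for op, arg in ops:
        out += old[arg:arg+BLOCK] if op == "copy" else arg
    return out

old = b"AAAABBBBCCCC"   # outdated archive on the slave
new = b"AAAAXXBBBB"     # latest archive on the master
assert patch(old, delta(signature(old), new)) == new
```

Only the literal bytes and short copy instructions cross the network; the matched blocks are taken from the slave's own outdated copy.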

Lastly for this subsection, the implementation should target regions with severe network connectivity problems. Although transmitting only the difference rather than the whole contents reduces the transmission cost, it is not the only answer to network stability issues. A network stability issue can be a long cutoff in the middle of a transmission, which forces the synchronization process to restart. Another is short cutoffs, which make the transmission discrete but do not require a restart; however, frequent short cutoffs can corrupt the transmitted data. To solve this unstable network problem, techniques implemented in most download manager applications should also be implemented in the synchronization's transmission. To support resumable downloads after the transmission is completely cut off, the transmission data are split into pieces. After a cutoff, the transmission can be continued by detecting how many pieces the client already has, then requesting and retrieving the remaining pieces from the server. To prevent data corruption, checksums can be used to verify the data's integrity, in this case the integrity of the pieces. Finally, Figure 3.6 is modified into Figure 3.10.

https://file.army/i/B4GJQGE
Figure 3.10 Implementation of some download manager techniques in the rsync algorithm based synchronization. The delta is split into pieces which are retrieved by the client. The integrity of the pieces is checked using a checksum, here MD5, and inconsistent pieces are redownloaded. In the end the pieces are merged. This can also be implemented on the uplink side when sending the signature.
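The piece-splitting and verification idea can be sketched as below. The piece size, the dictionary of already-received pieces, and the resume function are illustrative assumptions; MD5 is the integrity checksum, as in Figure 3.10:

```python
import hashlib

PIECE = 4  # illustrative piece size

def make_pieces(data):
    """Server side: split the delta and publish per-piece MD5 checksums."""
    pieces = [data[i:i+PIECE] for i in range(0, len(data), PIECE)]
    return pieces, [hashlib.md5(p).hexdigest() for p in pieces]

def resume(received, pieces, sums):
    """Client side: keep verified pieces, re-fetch corrupt or missing ones."""
    out = []
    for idx, checksum in enumerate(sums):
        have = received.get(idx)
        if have is None or hashlib.md5(have).hexdigest() != checksum:
            have = pieces[idx]  # stands in for re-requesting from the server
        out.append(have)
    return b"".join(out)

delta = b"ABCDEFGHIJKL"             # the delta to be transmitted
pieces, sums = make_pieces(delta)
partial = {0: b"ABCD", 1: b"EFXH"}  # piece 1 corrupt, piece 2 missing
assert resume(partial, pieces, sums) == delta
```

After a cutoff only the missing or corrupt pieces are re-requested, so a long interruption no longer forces the whole synchronization to restart.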

3.3.3 Experiment Result and Evaluation

With the dump and upload based synchronization prototype created, an experiment was conducted. The experiments took place on many LMSs in their latest versions: Moodle 3.3, Atutor 2.2.2, Chamilo 1.11.4, Dokeos 3.0, Efront 3.6.15.5, and Illias 5.2. The purpose was to compare the network traffic of full synchronization and incremental synchronization, and the percentage of duplicate data eliminated. The experiment used the author's own original course contents, which consist mainly of three topics (computer programming, computer networks, and penetration testing), each with materials, discussion forums, assignments, and quizzes. A snapshot of one of the topics is provided in Figure 3.3.

There are four scenarios. The first is full synchronization, equivalent to transmitting the whole course content, a full download from the client side. The second is large content incremental synchronization, where the client has only one of the three topics (for example, Moodle updates from 16.5MB to 30.5MB). The third is medium content incremental synchronization, where the client already has two of the three topics (for example, Moodle updates from 28.4MB to 30.5MB) and wants to synchronize with the server to obtain all three topics. The fourth is no revision, meaning incremental synchronization while there is no update, to test whether there are bugs in the software; the desired result is almost no network traffic. Table 3.1 shows the course content data sizes when a course contains one, two, or three of the topics. The data sizes vary depending on the LMS, but the contents such as materials, discussion forums, assignments, and quizzes are almost exactly the same.

Table 3.1 Size of the course contents of the same course on different LMSs, showing the sizes when it contains one, two, and three topics.
LMS 1 Topic 2 Topics 3 Topics
Moodle 16.5 MB 28.4 MB 30.5 MB
Atutor 336.5 kB 11.7 MB 13.7 MB
Chamilo 8.5 MB 20 MB 22 MB
Dokeos 27.4 MB 39 MB 41 MB
Efront 16.5 MB 28 MB 30 MB
Illias 439.3 kB 22.8 MB 26.6 MB

The experiment used the rdiff utility to perform the rsync algorithm between the latest and outdated archives as the incremental synchronization. Before proceeding, it is wise to examine the effect of block size, which, as stated in the previous subsection, users are free to define. The test was performed on Moodle's archives from Table 3.1, between an archive with one topic of 16.5MB and an archive with all three topics of 30.5MB. The result in Figure 3.11 shows the relationship between block size, signature size, and delta size, which together determine the total transmission cost (the sum of signature and delta). A larger block size means fewer blocks and fewer checksum sets, thus a smaller signature. However, it also means less accurate checking, which is less likely to detect similar blocks and therefore inflates the delta. Figure 3.11 shows the delta reaching the full size of the target archive, meaning similar blocks were missed entirely and the whole archive was treated as completely different; in that case incremental synchronization becomes even heavier than full synchronization. Conversely, a smaller block size provides more accurate detection, which reduces the size of the delta. However, it also means more blocks and more checksum sets bundled into the signature, and the figure shows the signature can grow so large that it costs more than full synchronization itself. In conclusion, choosing the right block size is crucial to minimize the sum of signature and delta that makes up the transmission cost; in this case a block size of 512 bytes is optimal.

https://file.army/i/B4GJXmU
Figure 3.11 Test result showing the relationship between block size, signature, and delta. As the block size increases the signature size decreases, but the delta size increases. The full file is the size of the file downloaded without the differential method, in other words full synchronization. The transmission cost of incremental synchronization is the sum of signature and delta, which in this case is optimal at a block size of 512 bytes.
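The signature side of the trade-off can be estimated with a simple model: one weak (4-byte Adler-32) plus one strong (16-byte MD5) checksum per block, ignoring any container overhead in the real signature format (so the numbers are approximations, not the measured values of Table 3.2):

```python
import math

ARCHIVE = int(16.5 * 1024 * 1024)  # outdated Moodle archive from Table 3.1
PER_BLOCK = 4 + 16                 # Adler-32 + MD5 bytes per block (model)

def signature_bytes(block_size):
    return math.ceil(ARCHIVE / block_size) * PER_BLOCK

# Doubling the block size roughly halves the signature, while (as Figure 3.11
# shows) an overly large block inflates the delta instead.
assert signature_bytes(512) == 675840              # about 0.64 MB
assert signature_bytes(4096) < signature_bytes(512)
```

The delta side cannot be modeled this simply, since it depends on how the actual contents align with block boundaries, which is why the optimum has to be found empirically as in Figure 3.11.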

With the relationship of block size to signature and delta discussed, the experiment was still not ready to proceed. Given the difference between the two archives' sizes, with the latest at 30.5MB and the outdated at 16.5MB, the delta should ideally be 14MB, yet it strayed as far as 20MB. The problem was found to be that the rsync algorithm (rdiff) was executed directly on the archive while it was still compressed. The solution is to uncompress the archive beforehand and execute rdiff recursively on every available content, which led the author to turn to a modified utility called rdiffdir.
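Why compression defeats the block matching can be demonstrated with zlib: two inputs that differ in a single byte produce compressed streams that diverge almost immediately, so rdiff finds hardly any common blocks between them. An illustrative sketch:

```python
import zlib

base = b"A" * 10000
comp_old = zlib.compress(b"X" + base)  # two archives differing by one byte
comp_new = zlib.compress(b"Y" + base)

# Length of the shared prefix of the two compressed streams.
common = 0
while common < min(len(comp_old), len(comp_new)) and comp_old[common] == comp_new[common]:
    common += 1

assert (b"X" + base)[1:] == (b"Y" + base)[1:]  # raw data: all but 1 byte shared
assert common < 20                              # compressed: diverges early
```

Running the differential algorithm on the uncompressed contents restores the block-level similarity, which is exactly what rdiffdir exploits by working on the unpacked directory tree.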

The experiment succeeded, with the results shown in Figure 3.12. Figure 3.12 already includes uplink and downlink; for incremental synchronization the uplink is determined by the size of the signature and the downlink by the size of the delta (see Figure 3.6). Detailed data are also provided in Table 3.2. The purpose of Figure 3.12 and Table 3.2, however, is only to show that incremental synchronization generates less network traffic than full synchronization, and that incremental synchronization detects when there are no updates (generating almost no network traffic), while the main objective is to eliminate duplicate data during transmission.

https://file.army/i/B4GJAU3
Figure 3.12 Network traffic generated in the four scenarios of the experiment. Full synchronization generates the most network traffic, shown in blue bars. The orange and yellow bars are the network traffic of incremental synchronization, which depends on the size of the contents to be updated and is lower than that of full synchronization. The green bars show incremental synchronization executed when there is no update; the results are very low and tolerable.
Table 3.2 Detailed experiment results for Figure 3.12, showing the size of the signature and delta during the incremental synchronization scenarios on each LMS.
Signature in Mega Bytes Delta in Mega Bytes
LMS Large Medium None Large Medium None
Moodle 0.5427 0.9668 1.1621 15.7489 2.9688 0.7227
Atutor 0.0292 0.3125 0.3711 13.5254 2.0899 0.0684
Chamilo 0.215 0.5427 0.6144 14.4282 2.6214 0.2048
Dokeos 1.307 1.6282 1.6794 15.0938 3.6535 0.9626
Efront 0.1024 0.1741 0.1946 13.6499 2.1402 0.0102
Illias 0.0025 0.1339 0.1559 26.2226 4.0107 0.0001
Average 0.3671 0.6264 0.6962 16.4431 2.9141 0.3281

The percentage of redundant data eliminated in the incremental synchronization scenarios is shown in Table 3.3. The ideal delta is assumed to be the difference in data size between the latest and the outdated archive. The duplicate data is the outdated archive itself, or equivalently the latest archive minus the ideal delta; this is the amount that should be eliminated. The larger the experiment's delta compared to the ideal delta, the worse the result. The performance of the incremental synchronization can then be evaluated by calculating the percentage of duplicate data eliminated: the full latest archive minus the experiment's delta size, divided by the duplicate data, converted to a percentage. For large content synchronization, one LMS, Atutor, had a low result of 51.89% due to the size of its generated archive itself (Table 3.1), dropping the whole average to 85.30%. Other than Atutor and Illias, the percentage of duplicate data eliminated is above 89%. For medium content synchronization a very high average of 97.90% is achieved, meaning duplicate data are almost completely eliminated. These results are obtained strictly under the optimal block size configuration (Figure 3.11), where the minimum network traffic consisting of uplink and downlink (determined by signature and delta size) is desired: there is no benefit in 100% duplicate data elimination if the uplink (signature size) is very large.

Table 3.3 Experiment results comparing the delta size to the ideal size; the percentage of duplicate data eliminated is calculated from these data.
In Mega Bytes Large Content Synchronization Medium Content Synchronization
LMS Full Result Ideal Eliminated Result Ideal Eliminated
Moodle 30.5 15.7389 14 89.46% 2.9688 2.1 96.94%
Atutor 13.7 13.5254 13.3635 51.89% 2.0899 2 99.23%
Chamilo 22 14.4282 13.5 89.08% 2.6214 2 96.89%
Dokeos 41 15.0938 13.6 95.55% 3.6535 2 95.76%
Efront 30 13.6499 13.5 99.09% 2.1402 2 99.50%
Illias 26.6 26.2226 26.1697 87.71% 4.0106 3.8 99.08%
Average 27.3 16.4431 15.6889 85.30% 2.9141 2.3167 97.90%
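The elimination percentage in Table 3.3 can be reproduced from its own columns (sizes in MB; the formula follows the paragraph above):

```python
def eliminated_pct(full, delta, ideal):
    """Share of the duplicate data that was not retransmitted (Table 3.3)."""
    duplicate = full - ideal                 # data both archives already share
    return round((full - delta) / duplicate * 100, 2)

# Moodle rows of Table 3.3: large and medium content synchronization.
assert eliminated_pct(30.5, 15.7389, 14) == 89.46
assert eliminated_pct(30.5, 2.9688, 2.1) == 96.94
```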

3.3.4 Advantage of Dump and Upload Based Synchronization

With the dump and upload based incremental synchronization model successfully eliminating a very large amount of duplicate data, its advantages over the previous dynamic content synchronization can be discussed:

  1. Since the model utilizes existing utilities, mainly the export and import feature of LMSs, one software application can be compatible with all LMSs and all of their versions, as long as they have this feature. The reason is that the export and import feature is maintained by the LMSs' developers; unlike with the dynamic content synchronization software, there is no need to worry about structural changes in the LMS. The advantage is really on the developer side: when writing dynamic content synchronization software the writer has to handle the database and directories, while for dump and upload based synchronization this is already taken care of by the LMSs' developers.
  2. Other benefits can also be obtained from the export and import feature, although they are relative to the LMS. For example, Moodle can choose whether or not to include private data, so the synchronization can have a flexible privacy option, while other LMSs always filter private data out, leaving no option other than retaining privacy. Another example, also in Moodle, is the ability to split a course into smaller blocks of learning contents and to dump specific learning contents rather than all of them. The synchronization software can then be tuned for partial synchronization, meaning other teachers can get only the parts they are interested in. Unfortunately this is available only in Moodle; other LMSs have to dump the whole course contents.
  3. Since the method is based on dumping, it can easily be tuned for bidirectional synchronization, unlike dynamic content synchronization, which is unidirectional. The incremental synchronization uses the pull concept, where the requesting server asks only for the difference from the target server, while the push concept is usually unidirectional, with the master forcefully updating the slaves. Although dynamic content synchronization is claimed to be unidirectional, the author believes it could be modified to be bidirectional because the differential synchronization method is general; however, it is unknown whether it would be as easy to modify as dump and upload based synchronization.

4 Conclusion and Future Work

4.1 Conclusion

A portable and synchronized distributed LMS was introduced to keep contents up to date in environments with severe network connectivity problems. By replacing the servers with hand carry servers, servers in regions with severed networks are able to move to find network connectivity for synchronization. The hand carry server proved to be very portable because of its very small size and light weight. Its power consumption is so low that a power bank used for a smartphone is enough to run it for almost a whole day. Though very convenient, it has resource limitations, mainly in CPU and memory, which limit the number of concurrent users. Still, the problem of being unable to perform synchronization in areas without network connectivity is solved.

The incremental synchronization technique was beneficial for synchronization in a distributed LMS, where it eliminates a very large amount of duplicate data. Though incremental synchronization had already been proposed for distributed LMSs in the past, this thesis provides a better approach, dump and upload based synchronization. Its advantages are that it is compatible with most LMSs and most of their versions, it is easily tunable for bidirectional synchronization, and because it utilizes LMS features it can be tuned, for example, to configure privacy settings and to perform partial synchronization.

4.2 Future Work

All of the experiments were done in the lab, and it would be better to conduct real implementations in the future, especially regarding the hand carry servers. One possible real implementation is to have drones carry the hand carry servers. Performance is still a problem for hand carry servers, which demands enhancement techniques such as integrating a field programmable gate array (FPGA). For incremental synchronization, only the network issue was discussed, not yet resources such as CPU and memory. Although the synchronization in this thesis is bidirectional, a distributed revision control system needs to be implemented for larger collaborations. The distributed LMS here is a replicated system, but there is a better, more flexible trend, especially for content sharing, which is a message oriented middleware (MOM) system; implementing it would be very interesting future work.

Acknowledgement

I would like to give my utmost gratitude to the Almighty who created me and this world, for the opportunity and permission to walk this path as a scholar and for all the hidden guidance.

The first person I would like to thank is my main supervisor Prof. Tsuyoshi Usagawa for giving me this topic, and also Dr. Royyana, who researched this topic before me, for their countless wise advice in perfecting this research. The professor is also the one who gave me the opportunity to enroll in this Master's program in the Graduate School of Science and Technology, Kumamoto University. It was also through his recommendation that I received the Ministry of Education, Culture, Sports, Science and Technology (MEXT) scholarship from Japan. Not to forget his invitation to join his laboratory, and the facilities and comfort that he provided. I would also like to thank him for all the opportunities he gave me to join many conferences, such as in Tokyo, Myanmar, and Hong Kong.

I would then like to thank the Japanese government for the MEXT scholarship, thanks to which I never had to worry about finances and could instead focus on my studies, research, planning my goals for the future, and helping other people. I would also like to thank my other supervisors Prof. Kenichi Sugitani and Prof. Kohichi Ogata for evaluating my research and my thesis.

Next I would like to thank my parents, my family, and my previous university, Udayana University, for not only raising and supporting me but also pushing me to continue my studies. I would like to thank my project team, Hendarmawan and Muhammad Bagus Andra, whose work with me on hand carry servers contributed to this thesis. My project team and my friends in the laboratory, Alvin Fungai, Elphas Lisalitsa, Irwansyah, Raphael Masson, and Chen Zheng Yang, were mostly at my side and even contributed to some degree to all my research. Like my friends from my previous university, with whom I often spent nights together in the laboratory and who have now gone their separate ways, they are friends whom I can trust with my life.

I would like to thank the Indonesian community, Japanese friends, and other international friends who helped me with life here, for example finding an apartment for me, but mostly for their friendliness. Lastly, to all the others who helped me whom I cannot mention one by one, whether known or unknown, and whether seen or unseen: I hope we can continue to work together in the future.

Reference

  1. M. Kelly, “openclipart-libreoffice,” (2017), [computer software] Available: http://www.openclipart.org. [Accessed 27 June 2017].
  2. S. Paturusi, Y. Chisaki, and T. Usagawa, “Assessing lecturers and students readiness for e-learning: A preliminary study at national university in north sulawesi indonesia,”GSTF Journal on Education (JEd), vol. 2, no. 2, pp. 18, (2015), doi: 10.5176/2345-7163_2.2.50
  3. Monmon. T, Thanda. W, May. Z. O, and T. Usagawa, “Students E-readiness for E-learning at Two Major Technological Universities in Myanmar,” In Seventh International Conference on Science and Engineering, pp. 299-303, (2016), Yangon, Myanmar.
  4. O. Sukhbaatar, L. Choimaa, and T. Usagawa, “Evaluation of Students’ e-Learning Readiness in National University of Mongolia,” Educational Technology (ET) Technical Report on Collaborative Support, etc., pp. 37-40 (2017). Soka University: Institute of Electronics, Information and Communication Engineers (IEICE).
  5. E. Randall, “Mongolian Teen Aces an MIT Online Course, Then Gets Into MIT,” [online] Available: http://www.bostonmagazine.com/news/blog/2013/09/13/mongolian-teen-aces-mit-online-course-gets-mit. [Accessed 27 June 2017].
  6. N. S. A. M. Kusumo, F. B. Kurniawan, and N. I. Putri, “Learning obstacle faced by indonesian students,” in The Eighth International Conference on eLearning for Knowledge-Based Society, Thailand, Feb. (2012), [online] Available: http://elearningap.com/eLAP2011/Proceedings/paper25.pdf. [Accessed 27 June 2017].
  7. Miniwatts Marketing Group, “Internet World Stats Usage and Population Statistics,” [online] Available: http://www.internetworldstats.com/stats.htm. [Accessed 27 June 2017].
  8. Q. Li, R. W. H. Lau, T. K. Shih, and F. W. B. Li, “Technology supports for distributed and collaborative learning over the internet,” ACM Transactions on Internet Technology (TOIT), vol. 8, issue 2, article no. 5, (2008), doi: 10.1145/1323651.1323656.
  9. F. Purnama, and T. Usagawa, “Incremental Synchronization Implementation on Survey using Hand Carry Server Raspberry Pi,” Educational Technology (ET) Technical Report on Collaborative Support, etc., pp. 21-24, (2017). Soka University: Institute of Electronics, Information and Communication Engineers (IEICE).
  10. F. Purnama, M. Andra, Hendarmawan, T. Usagawa, and M. Iida, “Hand Carry Data Collecting Through Questionnaire and Quiz Alike Using Mini-computer Raspberry Pi,” International Mobile Learning Festival (IMLF), pp. 18-32, (2017), [online] Available: http://imlf.mobi/publications/IMLF2017Proceedings.pdf. [Accessed 27 June 2017].
  11. R. M. Ijtihadie, B. C. Hidayanto, A. Affandi, Y. Chisaki, and T. Usagawa, “Dynamic content synchronization between learning management systems over limited bandwidth network,” Human-centric Computing and Information Sciences, vol. 2, no. 17, (2012), doi: 10.1186/2192-1962-2-17.
  12. F. Purnama, T. Usagawa, R. Ijtihadie, and Linawati, “Rsync and Rdiff implementation on Moodle’s backup and restore feature for course synchronization over the network,” IEEE Region 10 Symposium (TENSYMP), pp. 24-29, (2016). Bali: IEEE, doi: 10.1109/TENCONSpring.2016.7519372.
  13. The World Bank Group. Mobile cellular subscriptions (per 100 people). (2017,March 06). Retrieved from http://data.worldbank.org/indicator/IT.CEL.SETS.P2.
  14. R. M. Ijtihadie, Y. Chisaki, T. Usagawa, B. C. Hidayanto, and A. Affandi, “E-mail Based Updates Delivery in Unidirectional Content Synchronization among Learning Management Systems Over Limited Bandwidth Environment,” IEEE Region 10 Conference (TENCON), pp. 211-215, (2011), doi: 10.1109/TENCON.2011.6129094.
  15. R. M. Ijtihadie, Y. Chisaki, T. Usagawa, B. C. Hidayanto, and A. Affandi, “Offline web application and quiz synchronization for e-learning activity for mobile browser” 2010 IEEE Region 10 Conference (TENCON), pp. 2402-2405, (2010), doi: 10.1109/TENCON.2010.5685899.
  16. M. Cooch, H. Foster, and E. Costello, “Our mooc with moodle," Position papers for European cooperation on MOOCs, EADTU, (2015).
  17. J. W. Hunt, and M. D. McIlroy, “An algorithm for differential file comparison,” Computing Science Technical Report, (1976). New Jersey: Bell Laboratories, [online] Available: https://www.cs.dartmouth.edu/~doug/diff.pdf. [Accessed 27 June 2017].
  18. T. Usagawa, A. Affandi, B. C. Hidayanto, M. Rumbayan, T. Ishimura, and Y. Chisaki, “Dynamic synchronization of learning contents among distributed moodle systems,” JSET, pp. 1011-1012, (2009).
  19. T. Usagawa, M. Yamaguchi, Y. Chisaki, R. M. Ijtihadie, and A. Affandi, “Dynamic synchronization of learning contents of distributed learning management systems over band limited network contents sharing between distributed moodle 2.0 series," in International Conference on Information Technology Based Higher Education and Training (ITHET), (2013). Antalya, doi: 10.1109/ITHET.2013.6671058
  20. A. Tridgell and P. Mackerras, “The rsync algorithm," The Australian National University, Canberra ACT 0200, Australia, Tech. Rep. TR-CS-96-05, (1996), [online] Available: https://openresearch-repository.anu.edu.au/handle/1885/40765. [Accessed 27 June 2017].

Author

Fajar Purnama

Note

  • This is a dissertation submitted to the Graduate School of Science and Technology, Computer Science and Electrical Engineering, Kumamoto University, Japan, in September 2020 in partial fulfillment of the requirements for the degree of Doctor of Philosophy. It was not published, so the copyright remains with me, "Fajar Purnama", the main author, and I have the authority to repost it anywhere; I claim full responsibility for it, detached from Kumamoto University. Except for contents marked with copyright (©), I hereby declare to license it under a customized CC-BY-SA, where you are also allowed to sell my contents, on the condition that you mention that the free and open version is available here. In summary, the mention must contain the keywords "free" and "open" and the location, such as the link to this content.
  • The presentation is available at Slide Share.
  • The source code is available at GitHub.

Declaration of Authorship

I, Fajar PURNAMA, declare that this thesis titled, “Development of a Lossy Online Mouse Tracking Method for Capturing User Interaction with Web Browser Content”, and the work presented in it are my own. This thesis is based on a few of my publications, and I hereby confirm that I have permission to reuse them:

  • For my journal paper titled "Implementation of real-time online mouse tracking on overseas quiz session" (Purnama et al., 2020b), the copyright was transferred to Springer Science+Business Media, LLC, part of Springer Nature but the authors and I have been granted full permission to reuse the accepted version of the journal paper.
  • My journal paper titled "Using real-time online preprocessed mouse tracking for lower storage and transmission costs" (Purnama and Usagawa, 2020) is open access under Creative Commons (CC-BY), where anyone can reuse the whole material.
  • For my proceeding paper titled "Rsync and Rdiff implementation on Moodle’s backup and restore feature for course synchronization over the network" (Purnama, Usagawa, et al. 2016), the copyright was transferred to IEEE, but the authors do not need formal permission to reuse the accepted version of the proceeding paper.
  • For my technical report titled "Incremental Synchronization Implementation on Survey using Hand Carry Server Raspberry Pi" (PURNAMA and USAGAWA 2017), the copyright was transferred to IEICE but the authors and I have been granted full permission to reuse the published version of the report paper (IEICE, 2015).
  • For my proceeding paper titled "Demonstration on Extending The Pageview Feature to Page Section Based: Towards Identifying Reading Patterns of Users" (Purnama, Fungai, and Usagawa 2016), the copyright was not transferred, thus the copyright remains with the authors.
  • More detailed information is available in Appendix B.

Abstract

Though people are confined inside their houses due to COVID-19, they are forced to continue their activities online. The demand for tools to monitor these activities increases, for example, to make sure that students read materials and that examinees do not cheat during online examinations. Unfortunately, conventional web logs cannot monitor those kinds of activities. One monitoring tool is mouse tracking, which tracks the actions of the mouse cursor, including clicks, movements, and scrolls, covering the majority of online users’ interaction with browser contents. Though mouse tracking is promising, very few have implemented this tool because (1) previous mouse tracking tools require desktop installations, which is bothersome to the users, and (2) rumors persist that mouse tracking generates big data, such as the saying that a swipe from left to right generates a megabyte of data. This thesis tackles those problems by building a mouse tracking server application that is easily installable and does not require users to install any application other than the web browser. The application was implemented in an overseas quiz session between the National University of Mongolia and Kumamoto University, where the amount of data generated was also investigated. This thesis also contributes a lossy online mouse tracking method that can greatly reduce the amount of data generated. Finally, some visualizations of the mouse tracking data are shown, and possible applications such as online examination cheating prevention and forced reading of terms of service are discussed.

Acknowledgements

My first gratitude goes to my supervisor, Prof. Tsuyoshi Usagawa, for taking care of me for five years, from my Master’s program until the end of my Doctoral program. His deeds are almost immeasurable, because without him, Kumamoto University, and The Ministry of Education, Culture, Sports, Science and Technology Japan, the best five years of my life so far may not have been possible. I would like to thank my reviewers, Prof. Kohichi Ogata, Prof. Kenichi Sugitani, Prof. Masahiko Nishimoto, and Prof. Masayoshi Aritsugi, for their time in reviewing this thesis. I greatly thank my friend Alvin Fungai as the co-founder of this topic; without him, the topic of this thesis would have been different, and I might have been late in finishing it, because I tried other topics and found them much more difficult or simply unsuited to me. The critical development phase of this research was thanks to the Computer Algorithm class by Prof. Masayoshi Aritsugi and all of the participating members, including Hendarmawan, Hamidullah Sokout, Alhafiz Akbar Maulana, and Sari Dewi; it was in those moments that I decided on this topic for my Doctoral thesis. The implementation and data were thanks to Dr. Otgontsetseg Sukhbaatar, Prof. Lodoiravsal Choimaa, and the students of the School of Engineering and Applied Sciences, National University of Mongolia; without them, this topic might not have made it into two international journal publications, which might have prevented the completion of this thesis. Lastly, I would like to thank my mother Linawati, my father Teddy Junianto, and Ni Nyoman Sri Indrawati for their daily support.

Table of Contents

Declaration of Authorship

Abstract

Acknowledgements

  1. Introduction
    1. Background
    2. Problem
    3. Objective
    4. Hypothesis
    5. Contribution
    6. Benefit and Significance
    7. Thesis Structure
  2. Online Mouse Tracking Implementation and Investigation
    1. System Overview
      1. Mouse Tracking in Web Development
      2. Online Mouse Tracking System
      3. Privacy Policies
    2. Network Data Transmitted by One Click
      1. Peer to Peer Experiment
      2. Data Generation Estimation for Implementation Plan
    3. Overseas Online Mouse Tracking Implementation
      1. Quiz Details
      2. Amount of Data Generated
  3. Online Mouse Tracking Resource Saving Methods
    1. Existing Methods
    2. Real-Time Online Mouse Tracking
    3. Lossy Online Mouse Tracking
      1. Three Mouse Tracking Preprocessing and Transmission Method
      2. Three Mouse Tracking Preprocessing and Transmission Simulations
      3. Three Mouse Tracking Preprocessing and Transmission Results
      4. Synchronization for Hand Carry Server Quiz
  4. The Depth Levels of Logs
    1. Web page / Course Content Level Logs
      1. Conventional Web Logs and Educational Data
      2. Amount of Interactions
      3. Web Page or Course Content Inactivity
    2. Area Level Logs
  5. Conclusion and Future Work
    1. Conclusion
    2. Future Work

A Data

B Copyrights

References

List of Figures

  1. In Chapter 1:
  2. In Chapter 2:
    1. Mouse Tracking Illustration.
    2. DOM representation of Table 2.1 (Purnama and Usagawa, 2020). The html tag is the parent with head, body, and footer tag as the children. Head has a child tag title, body has a child tag p, and footer has a child tag p.
    3. Mouse Tracking Chrome extension.
    4. Mouse Tracking Plugin on Moodle.
    5. Online Mouse Tracking Framework.
    6. Moodle Plugin Install.
    7. Privacy Policy.
    8. P2P real-time mouse tracking experiment.
    9. A plot of data rate generated by a user based on the events generated per second ©(Purnama et al., 2020b). The horizontal axis represents the events per second or frequency in hertz (Hz) and the vertical axis represents the data rate in kilobytes per second. The different colored lines represent the number of variables included (refer to Table 2.2).
    10. Overseas real-time online mouse tracking implementation.
    11. Moodle Log.
    12. Moodle Grade.
    13. Screenshot of mouse tracking data of students from National University of Mongolia who attempted a quiz session on a Moodle server at Kumamoto University ©(Purnama et al., 2020b).
    14. Total queries/rows/events generated by each student during the mouse tracking implementation between the National University of Mongolia and Kumamoto University and the estimated total data transmission size ©(Purnama et al., 2020b). The horizontal axis represents individual students, the primary vertical axis the queries/rows/events, and the secondary vertical axis the estimated data transmission size.
  3. In Chapter 3:
    1. Data rate during the mouse tracking implementation between the National University of Mongolia and Kumamoto University. The horizontal axis represents 10-minute intervals and the vertical axis represents the data rate in kilobytes per second. The yellow horizontal line shows the average and the vertical lines show the minimum and maximum during their respective intervals ©(Purnama et al., 2020b).
    2. Flowchart of mouse tracking ©(Purnama et al., 2020b): offline (left), online (middle), real-time and online (right).
    3. Illustration of bottleneck network in regular online mouse tracking and real-time online mouse tracking as a solution ©(Purnama et al., 2020b).
    4. Whole page vs region of interest vs default mouse tracking illustration. The left scroll illustrates summarized event amount that summarizes the number of events occurring on the whole page; the middle scroll illustrates ROI tracking that summarizes the number of events occurring in defined areas, and the right scroll illustrates default mouse tracking that records every event and the precise point where it occurs, forming a trajectory.
    5. Three Types of Mouse Tracking Flowchart. The left flowchart is default mouse tracking, the middle flowchart is summarized event amount, and the right flowchart is region of interest mouse tracking (Purnama and Usagawa, 2020).
    6. In Purnama and Usagawa, 2020, the simulation is based on Figure 2.10. In this thesis, the server is changed to a single board computer, a Raspberry Pi 3. The reason is to support regions with limited connectivity, as in Figure 3.7.
    7. Even though the ownership of computer and mobile devices increases drastically, the pace of Internet penetration may not be as fast. Those in limited-connectivity regions may not be able to enjoy online quizzes, let alone mouse tracking. Therefore, Purnama et al., 2017 offer a hand carry server solution where the students’ computer devices can connect to the teacher’s single board computer server that runs the quiz and the mouse and touch tracking.
    8. The total script running time of three mouse tracking demo sessions by the author. The horizontal axis is the mouse tracking method. The data in order are from Mozilla Firefox, Microsoft Edge, and Google Chrome. The vertical axis is the total running time in milliseconds. Among the three browsers, Mozilla Firefox performs faster than Microsoft Edge, and Microsoft Edge performs faster than Google Chrome for this work (Purnama and Usagawa, 2020).
    9. CPU and RAM usage and data rate comparison between default mouse tracking, summarized event amount, and ROI mouse tracking.
    10. Suppose there are two quiz sessions like the one in this thesis. The teacher has to synchronize the data two times: after the first session and after the second session. Although the human mind knows that it is better to update, computers today still do not operate that way. Even the default copying on most desktops still functions by copying the whole data and replacing the old, shown on the left. Today, a separate application must be used to perform incremental synchronization, shown on the right, which is able to calculate the difference between the old and new data ©(PURNAMA and USAGAWA, 2017).
    11. A detailed illustration of the rsync algorithm procedure, where the steps in summary are: split the data into blocks, scan for relocated blocks, and scan for blocks that do not exist, which can be newly added blocks or unused blocks to be deleted. Finally, relocation, addition, and deletion are executed based on the information obtained from the scanning (Purnama, 2017).
  4. In Chapter 4:
    1. Six levels of web logs, in order from shallowest to deepest: Internet, websites, categories, web pages, area, and coordinates.
    2. Six levels of educational data, in order from shallowest to deepest: Internet, academies, courses, course contents, area, and coordinates.
    3. Web Log vs Eye Tracking.
    4. Inactive Query Time Domain.
    5. An exam detector that tracks unwanted activities of participants, such as the mouse leaving the exam, tab and meta key presses to leave the exam, and other events indicating leaving the exam.
    6. Mouse Tracking Heatmap.
    7. Mouse activity heatmap of quiz page locations in time series. The horizontal axis represents 10-minute intervals and the vertical axis the quiz page locations. For the heatmap, green is close to minimum activity, yellow is close to the second quartile, and red is close to maximum activity.
    8. Mouse activity heatmap of quiz page locations for each student. The horizontal axis represents the quiz page locations and the vertical axis the anonymized students. For the heatmap, green is close to minimum activity, yellow is close to the second quartile, and red is close to maximum activity.
    9. Grade Heatmap.
    10. Illustration of force reading based on the duration of the mouse cursor stays in an area. The left example shows that the mouse cursor did not stay long enough in each area and tells the user to read everything, the middle example shows that the mouse cursor did not stay long enough in middle area and tells the user to complete reading middle area, and the right example shows satisfaction in user’s reading.
    11. Left Click Visualization.

List of Tables

  1. In Chapter 1:
  2. In Chapter 2:
    1. A web page code in simple HTML that contains html, head, title, body, p, and footer tags (Purnama and Usagawa, 2020)
    2. The data generated of one click posted to the server ©(Purnama et al., 2020b). The rows before the last row are the types of information, and the last row shows the data rate of the submitted post (Purnama et al., 2020a).
    3. Comparison of mouse tracking data size to daily pageview (monthlypageview), Moodle log and grades, Nasa server log 1995 (nasadata1995), Open University learning analytics dataset (openuniversitydata), and HarvardX Person-Course 2013 (DVN/26147_2014) ©(Purnama et al., 2020b).
  3. In Chapter 3:
    1. Comparison of data amount generated from the three types of mouse tracking.
  4. In Chapter 4:
    1. The duration and event amount generated by 41 Mongolian students during a quiz session.

1 Introduction

1.1 Background

Thanks to the development of information and communication technology (ICT), humanity lives in convenience. It is no longer necessary to spend much effort to seek information. In the past, people needed to travel to libraries to seek books, buy newspapers to get the latest news, gather in a community to hear the latest rumors, or even start a pilgrimage to find a master. Nowadays, most information is available on the Internet. With ownership of portable computer devices that can connect to the Internet from anywhere becoming mainstream, anyone can search for their desired information (Dentzel, 2013).

The Internet is not only an open, massive source of information where anyone can publish, but also a tool for distant activities. People can interact with each other without meeting, through text, voice, or video messages, regardless of time and place. More and more people do not go to shops but order items through online shopping. Some countries, like Indonesia, have developed applications that can order a variety of services online (Azzuhri et al., 2018), such as meal delivery, house cleaning, therapists, and more.

Due to the recent COVID-19 pandemic that started in early February 2020, most regions are in a lockdown where people are to stay away from each other (mostly asked to stay at home) to prevent the spread of infection. Even schools closed: most governments around the world have temporarily closed educational institutions in an attempt to contain the spread of the COVID-19 pandemic (UNESCO, 2020). All forms of activities are recommended to be done online, including educational activities, where courses are switched from face-to-face to online. The basics of an online course as known today are materials provided online, an online text discussion forum, a feature to submit assignments online, online quiz sessions (Linawati, Wirastuti, and Sukadarmika, 2017), and features to analyze and evaluate students' performance. For interactivity, people prefer to join live streaming videos, webinars, online game sessions, interactive online programming, etc.

Unfortunately, conventional web analytics does not measure up to how teachers examine or analyze students during face-to-face private tutoring. Teachers are normally able to examine students' attention, emotion, and motivation in real-time while they study, but conventional web analytics does not provide such features for online education. This is especially true for a very crucial educational activity: examination. Security is very tight in face-to-face examinations to prevent dishonest behavior, but this is not the case for online examinations today. This is why most educational institutes implement blended learning (Paturusi, Chisaki, and Usagawa, 2012), a mix of face-to-face and online courses, rather than fully online courses. This applies to anything online, not only education; for example, during shopping, shop owners are able to identify the interests of their customers face to face and act accordingly. As the simplest example, people can see whether someone is skimming or paying close attention while reading face to face. In online reading, one normally cannot know whether the viewer is actually reading the materials or not. A crucial example of this demand is reading detection for agreements or terms of service. Most people scroll down and accept the terms of service without actually reading them.

The lack of data for online analytics can actually be solved by eye tracking, mouse tracking, and other online monitoring techniques in real-time. Although these techniques were introduced in the early 20th century, they are still rarely implemented. One of the main reasons is the huge amount of data generated by these techniques, which is too much for most administrators and analyzers to handle (Leiva and Huang, 2015). This connects to the next reason: previous applications only suit academia and are not suited for wide implementation. For eye tracking, the hardware is intrusive, as users usually have to wear goggles. Non-intrusive devices exist, but they are mostly expensive. Mouse tracking is non-intrusive and free, because mice are available by default on every computer and no additional hardware is needed. However, previous applications were only suited to the laboratory, installed offline on each computer and not online. This thesis tackles that problem.

1.2 Problem

  1. There are almost no applications to monitor crucial online activities such as examinations.
  2. Although there are rumors of huge data generated by mouse tracking, there are almost no facts or investigations.
  3. The rumors already discourage public development of mouse tracking applications, and most of today's mouse tracking applications are only suitable for academia and laboratories.
  4. The huge amount of data generated is in line with the resources required for implementation; thus, methods for reducing data generation are necessary.

1.3 Objective

  1. Create an online mouse tracking application that is easily implementable.
  2. Investigate the data generated and resource usage of the online mouse tracking application.
  3. Implement methods to reduce the data generated and resource usage of the online mouse tracking application.
  4. Use mouse tracking data to capture users' interaction with web browser content and design a monitoring tool for crucial online activities, namely examinations and passage reading.

1.4 Hypothesis

This thesis proposes a new preprocessing based on demand method specifically for online mouse tracking. It allows the implementer to determine the data they need before implementation. Among those data, the geometrical data (the x and y mouse coordinates) are the largest portion generated. Most of the time, implementers do not need all of the data. Therefore, data generation, along with resource usage, can be reduced if they choose the regions of interest beforehand. In summary, by summarizing the coordinates into areas, the data generated can be reduced, which also reduces resource usage.
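The idea can be sketched in plain JavaScript. This is a minimal illustration, not the thesis' implementation: the region boundaries and names are hypothetical stand-ins for the implementer's chosen regions of interest. Instead of storing every (x, y) coordinate, each event only increments a counter for the region it falls in:

```javascript
// Hypothetical regions of interest (ROI), chosen before implementation.
// In a real page these would correspond to page areas such as quiz questions.
const regions = [
  { name: "header",   x: 0, y: 0,   w: 800, h: 100 },
  { name: "question", x: 0, y: 100, w: 800, h: 500 },
  { name: "footer",   x: 0, y: 600, w: 800, h: 100 },
];

// Map a raw (x, y) coordinate to the name of the ROI containing it.
function regionOf(x, y) {
  const r = regions.find(
    (r) => x >= r.x && x < r.x + r.w && y >= r.y && y < r.y + r.h
  );
  return r ? r.name : "outside";
}

// Summarize a stream of mouse events into per-region counts,
// discarding the coordinates themselves (the lossy step).
function summarize(events) {
  const counts = {};
  for (const e of events) {
    const name = regionOf(e.x, e.y);
    counts[name] = (counts[name] || 0) + 1;
  }
  return counts;
}
```

For example, `summarize([{x: 10, y: 50}, {x: 10, y: 150}, {x: 20, y: 160}])` yields `{header: 1, question: 2}`: three coordinate pairs collapse into two small counters, which is why choosing areas beforehand reduces the data generated.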

1.5 Contribution

  1. Created an open source real-time online mouse tracking application that can be implemented on any website and browser.
  2. Investigated the data generated and resource usage of the real-time online mouse tracking application.
  3. Developed a novel preprocessing based on demand method specifically for mouse tracking that reduces data generation and resource usage.
  4. Implemented the mouse tracking application online and obtained mouse tracking data.
  5. Visualized the mouse tracking data and derived information that is usually underivable from conventional web logs and educational data.
  6. Designed a possible software implementation for monitoring online reading and examination.

1.6 Benefit and Significance

  1. Mouse tracking is one of the missing keys for anything that is implemented fully online.
  2. Anyone can benefit from the open-source real-time online mouse tracking application in this thesis to implement or further develop online mouse tracking.
  3. The investigation of mouse tracking data generation and resource usage can help companies and other parties plan before implementing online mouse tracking.
  4. The methods presented to reduce mouse tracking data generation and resource usage give people in limited-connectivity areas the opportunity to utilize online mouse tracking.

1.7 Thesis Structure

Other than the introduction, this thesis contains four more chapters. The second chapter, online mouse tracking implementation and investigation, discusses the implementation of online mouse tracking on any website and browser, and the amount of data generated. The third chapter, online mouse tracking resource usage reduction methods, discusses known methods, the real-time implementation, and the novel preprocessing based on demand method. The fourth chapter, the depth levels of web logs and educational data, emphasizes mouse tracking logs as deeper-level data than conventional educational data logs. The last chapter is the conclusion and future work.

2 Online Mouse Tracking Implementation and Investigation

2.1 System Overview

2.1.1 Mouse Tracking in Web Development

Mouse tracking is a method to record the mouse activities of users. Mainly, it records the locations of clicks, movements, and scrolls, as illustrated in Figure 2.1. Mouse tracking can be developed at the desktop level or the application level. At the desktop level, mouse tracking tracks every mouse activity that occurs on the desktop, while at the application level, mouse tracking only tracks activities in the application and stops tracking when the mouse cursor leaves the application. In this thesis' case, the application is the web, which relates to browsers and websites. There are many programming languages, such as C, to develop desktop mouse tracking, while web-based mouse tracking is developed in a web programming language such as JavaScript (JS).

https://file.army/i/B4Mm2Pp https://file.army/i/B4MmcZK
Figure 2.1 Mouse tracking illustration where the top image is from a personal computer (PC) and the bottom image is from a smartphone/tablet © (Purnama et al., 2020b). Both images show the geometrical data (x and y coordinates) of the mouse, scroll, and touch events that occurred, as well as which mouse button was clicked, what percentage of zoom was applied, and whether a keyboard key was pressed.

The core of mouse tracking in web development is the Document Object Model (DOM), an Application Programming Interface (API) for Hypertext Markup Language (HTML) and Extensible Markup Language (XML) documents. It defines the logical structure of documents and the way a document is accessed and manipulated. Suppose a simple HTML page with the code in Table 2.1; its DOM structure can be represented as in Figure 2.2. With the Document Object Model, programmers can build documents, navigate their structure, and add, modify, or delete elements and content. Anything found in an HTML or XML document can be accessed, changed, deleted, or added using the Document Object Model, with a few exceptions. DOM is designed to be used with any programming language. Currently, it provides language bindings for Java and ECMAScript (an industry-standard scripting language based on JS and JScript) (Wood et al., 1998).

Table 2.1 A web page code in simple HTML that contains html, head, title, body, p, and footer tags (Purnama and Usagawa, 2020)

		<html>
		<head>
		<title>Simple Webpage</title>
		</head>
		<body>
		<p>Hello World!</p>
		</body>
		<footer>
		<p>CC</p>
		</footer>
		</html>
https://file.army/i/B4Mm5FD
Figure 2.2 DOM representation of Table 2.1 (Purnama and Usagawa, 2020). The html tag is the parent with head, body, and footer tag as the children. Head has a child tag title, body has a child tag p, and footer has a child tag p.
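The parent-child structure in Figure 2.2 can also be navigated programmatically. The following plain-JavaScript sketch models the tree of Table 2.1 with simple objects (not the real browser DOM API) and derives a path string for a node, similar to how a tracker can name the element under the mouse cursor; the `pathTo` helper and the `" > "` path notation are illustrative, not part of the thesis' code:

```javascript
// A simplified model of the DOM tree in Figure 2.2.
const dom = {
  tag: "html",
  children: [
    { tag: "head",   children: [{ tag: "title", children: [] }] },
    { tag: "body",   children: [{ tag: "p", children: [] }] },
    { tag: "footer", children: [{ tag: "p", children: [] }] },
  ],
};

// Depth-first search for the first node with the given tag,
// returning its path from the root, e.g. "html > body > p".
function pathTo(node, tag, trail = []) {
  const here = [...trail, node.tag];
  if (node.tag === tag) return here.join(" > ");
  for (const child of node.children) {
    const found = pathTo(child, tag, here);
    if (found) return found;
  }
  return null;
}
```

Here `pathTo(dom, "p")` returns `"html > body > p"`, the first `p` found depth-first; such a path distinguishes the paragraph in the body from the one in the footer.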

The implementation of mouse tracking is based on DOM events, specifically mouse, touch, and User Interface (UI) events, which are actions that occur as a result of the user's mouse actions or as a result of a state change of the user interface or elements of a DOM tree (Pixley et al., 2000). In this thesis, jQuery is used to access the DOM API and receive information related to mouse, touch, and UI events. The following list shows the mouse events utilized in this thesis:

  • Mousedown: when one of the mouse buttons is pressed (usually the left, middle, or right button)
  • Mouseup: when a pressed mouse button is released
  • Mousemove: when the mouse cursor moves
  • Mouseleave: when the mouse leaves an element (here only used to indicate temporarily leaving a webpage)
  • Mouseenter: when the mouse enters an element (here only used to indicate temporarily entering a webpage)
  • Scroll: when the webpage scrolls
  • Touchstart: when a computer device's screen is touched
  • Touchend: when a touch from touchstart is released
  • Touchmove: when a touch moves
  • Touchcancel: when a touch is interrupted

There are many DOM events that are not implemented by the application in this thesis. However, they may be implemented in the future if they are found to be useful. For now, the following DOM events other than mouse events are worth considering and are implemented:

  • Beforeunload: when the webpage is about to close
  • Resize: when the webpage is zoomed in or out
  • Keypress: when a keyboard key is pressed
  • Cut: when the user attempts to cut content
  • Copy: when the user attempts to copy content
  • Paste: when the user attempts to paste content
  • Dblclick: when a double click is performed
  • Auxiliarymenu: when a right-click menu is called

After implementing the DOM events, the information is processed by adding important labels. The first labels are time information, such as the date the information was received and a duration calculated as the difference between the current and previous received events. The second labels are place information, such as the category, page, post, course, or course content; if those are not available, the default place information is the Uniform Resource Locator (URL). More in-depth place information consists of the areas or sections of the page, and the deepest of all are the coordinates on the page. The third labels are identity information, if available and permitted, such as the name, email address, IP address, and location of the user.
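The labeling step above can be sketched as a pure function that attaches the time, place, and identity labels to a raw event. The field names here are illustrative assumptions rather than the application's actual schema.

```javascript
// Attach time (date, duration), place (URL, area), and identity labels
// to a raw event before it is stored or transmitted.
function labelEvent(rawEvent, previousDate, place, identity) {
  var now = rawEvent.date; // ms timestamp of the current event
  return {
    event: rawEvent.type,
    x: rawEvent.x,
    y: rawEvent.y,
    date: now,
    duration: previousDate == null ? 0 : now - previousDate, // time since the previous event
    url: place.url,           // default place label
    area: place.area || null, // optional deeper place label (section of the page)
    user: identity ? identity.name : null,
    email: identity ? identity.email : null
  };
}
```

For example, an event at timestamp 1000 whose predecessor arrived at 400 would be labeled with a duration of 600.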

2.1.2 Online Mouse Tracking System

The author developed an online mouse tracking application implementable on any website; the code is open source on GitHub (Purnama, 2019). It is written in HTML, Cascading Style Sheets (CSS), JS, jQuery, and PHP. The mouse tracking code can be implemented either on the client side, shown in Figure 2.3, or on the server side, shown in Figure 2.4. The difference is that the client side can capture anything, including all the web pages that the user visits, while the server side can only capture the events that happen on the server's website.

https://file.army/i/B4MmNNQ
Figure 2.3 Mouse Tracking Chrome extension © (Purnama et al., 2020b). The mouse tracking extension is visible on the extension bar. The user can open the configuration window by clicking the icon and configure the events (clicks, moves, scrolls) to record.
https://file.army/i/B4MmgKY
Figure 2.4 Mouse Tracking Plugin on Moodle © (Purnama et al., 2020b). The figure shows examples of mouse tracking implemented as a block plugin (in blue) and theme plugin (in red).

Figure 2.5 shows a more detailed server side implementation. The mouse, touch, and UI DOM events in the previous subsection are written in JS and jQuery and are placed on the representation side, which is the website, along with the HTML and CSS. The order of the online mouse tracking processes in Figure 2.5 is:

  1. The browser attempts to visit the website by requesting the HTML, CSS, and JS. If the mouse tracking is written as a server application, then its code is in the JS section; otherwise, it is directly installed on the client. The code is written in jQuery.
  2. The HTML, CSS, and JS are sent to the client.
  3. The browser renders the page by processing the HTML and CSS.
  4. JS and jQuery are often categorized as client-side programming languages. They run in the browser's background, where in this case the mouse tracking is running.
  5. What differentiates offline and online mouse tracking is where the mouse tracking log is stored. Offline mouse tracking stores the logs on the client, while online mouse tracking stores the logs on an online server. When storing the mouse tracking log online, the client side sends the log using the Hypertext Transfer Protocol (HTTP) POST method.
  6. The server processes the received log, usually using a server-side programming language such as PHP.
  7. The log can be stored as a file, in a database, or in any other form of storage.
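Step 5 above can be sketched as follows. The endpoint path `/mousetracking/store.php` is a hypothetical example; on the server (steps 6 and 7), a PHP script would read the POSTed fields and append them to a file or database.

```javascript
// Serialize a log entry as URL-encoded form fields, the same shape
// jQuery produces for a plain object passed to $.post.
function buildPostBody(entry) {
  return Object.keys(entry)
    .map(function (k) { return encodeURIComponent(k) + '=' + encodeURIComponent(entry[k]); })
    .join('&');
}

// Browser-only: send one event to the server with the HTTP POST method.
if (typeof $ !== 'undefined') {
  $.post('/mousetracking/store.php', { event: 'mousedown', x: 100, y: 200 });
}
```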
https://file.army/i/B4MmqQa
Figure 2.5 Online Mouse Tracking Framework © (Purnama et al., 2020b). The Framework is divided into two sides where one side is the client and the other side is the server. The client and the server are connected via the Internet. The server contains the front end, which is usually the representation side of the website, and back end where background processing and data storing occurs. There is a browser on the client equipped with client-side programming. The arrow presents the direction of the processes and the number presents the order of the processes.

The client side does not require high performance hardware; even a Raspberry Pi works. As for the operating system, any will do as long as it has a browser that can run JavaScript. The application developed in this thesis supports both offline and online log storage, either in the form of a file or in a database application. For the client application, this thesis provides a browser extension. Although it requires installation on each client, all browser activities, including visits to other websites, are tracked. The author bundled the mouse tracking browser extension code to make the installation easier: the client only needs to download and install it.

For the server application, the advantage is that the client does not need to install an additional application; the user just browses the website and mouse tracking runs automatically. The disadvantage is that it cannot track outside of the website, although it can still tell whether users are leaving the page or not. The server hardware depends on the number of users that the administrators want to handle; the hardware specification used in this thesis is discussed in the next section. For the software, a standard web server is enough, such as a server equipped with Apache2, PHP, and MySQL. For the installation, the author made it easy: all that is needed is to download the code and install it. In this thesis, the mouse tracking server application was implemented on a Learning Management System (LMS) called Moodle, which is used to handle online courses. The mouse tracking code is rearranged as a Moodle plugin, where the author made a block and a theme plugin for Moodle, shown in Figure 2.4. For usage, only one form of the plugin is chosen, either block or theme. The installation is also easy, as shown in Figure 2.6, where the process is only to download the plugin, upload it to Moodle, and install.

https://file.army/i/B4MmtbL
Figure 2.6 Screenshot and illustration of installing mouse tracking Moodle plugin. The page shows Moodle plugin installation page and the .zip image symbol represents the mouse tracking Moodle plugin in .zip format.

2.1.3 Privacy Policies

Privacy policies should be disclosed to the users during any form of data gathering. The European Union (EU) is stricter: cookie policies should be separated from the privacy policies. Disclosing privacy policies not only keeps the website in compliance with laws and regulations but also builds trust with the users (PrivacyPolicies.com, 2020).

Based on how mouse tracking is executed, which is illustrated in more detail in Figure 2.5, users actually have full control over the mouse tracking process and can stop it at any time, but they are usually unaware of it because the mouse tracking runs in the background. They would have to thoroughly inspect the background to see the running mouse tracking, and most users do not attempt this because they do not feel bothered by the process. This is the reason why mouse tracking is considered non-intrusive.

Another reason why most users do not attempt to inspect the background for the running mouse tracking is that this requires technical skills that most users do not possess. Therefore, they are usually not aware that their mouse, touch, and UI activities are recorded. To be in compliance with privacy policy on general public websites, the mouse tracking data gathered should be disclosed. The method is to pop up a mouse tracking configuration menu and, before that, a notification menu asking the user for permission to record, as illustrated in Figure 2.7. If they allow it, the options on the configuration menu should be marked; if they do not allow it, the options should be unmarked and no mouse tracking runs. In the educational sector, this depends on the academy/college/school/university and the lecturer/professor/teacher. Most of the time, the students are forced into compliance in having their activities recorded because of the demand to handle crucial educational activities such as preventing dishonest behavior during exams.

https://file.army/i/B4MmP79
Figure 2.7 The left image shows public privacy policy compliance illustration and the right image shows an agreement example between student and teacher about the recording of mouse tracking data for crucial educational activities.
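The consent flow described above can be sketched as a small decision function: tracking options are marked only when the user allows recording, and everything stays unmarked otherwise. The option names are illustrative assumptions.

```javascript
// Decide which tracking options are enabled given the user's consent.
function applyConsent(allowed, requestedOptions) {
  if (!allowed) {
    // Denied: unmark every option so no mouse tracking runs.
    return { clicks: false, moves: false, scrolls: false };
  }
  // Allowed: mark only the options the user selected in the configuration menu.
  return {
    clicks: !!requestedOptions.clicks,
    moves: !!requestedOptions.moves,
    scrolls: !!requestedOptions.scrolls
  };
}
```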

2.2 Network Data Transmitted by One Click

Leiva and Huang, 2015 stated that a mouse swipe from left to right can generate hundreds of cursor coordinates and that a minute of mouse activity can generate 1 MB (megabyte) of data. Huang, White, and Dumais, 2011 conducted massive scale mouse tracking on Microsoft's Bing search engine, but in the middle of the experiment they had to reduce the sampling rate because the data size was simply too large. Those two references are the only scientific records found that report the problem of the huge amount of data generated by mouse tracking. This shows that the data generated and the resource usage have not been formally investigated. Therefore, an implementation followed by an investigation was conducted by Purnama et al., 2020b.

2.2.1 Peer to Peer Experiment

The one click Peer-to-Peer (P2P) experiment measures the amount of data transmitted from the client to the server when the user performs one click, as shown in Figure 2.8. This experiment greatly helps the investigation because the result can be used to predict the data cost mathematically. However, the result is dependent on the application; as time passes, people may find ways to reduce the data.

https://file.army/i/B4MmUlo
Figure 2.8 P2P real-time mouse tracking experiment © (Purnama et al., 2020b). The right laptop has a Moodle server installed with mouse tracking code, while the left laptop has Ubuntu Desktop OS installed. The role of the latter is to access the Moodle server on the right laptop using a browser and perform one click. The right laptop receives the click event and stores it in the database while measuring the network cost of the click event.

The online mouse tracking application was installed on the author's Moodle server, and the resource costs were then measured. The data rate of the network was measured using a tool called Wireshark. The server is an Ubuntu 18.04 Long Term Support (LTS) server equipped with an Intel(R) Core(TM) i7-6800K Central Processing Unit (CPU) @ 3.40 Giga Hertz (GHz) (with SSE4.2), 32 Giga Bytes (GB) of DDR4 Random Access Memory (RAM), 10 Tera Bytes (TB) of hard drive, and an allocated 2 Mega Byte per second (MBps) network.

2.2.2 Data Generation Estimation for Implementation Plan

The result in Table 2.2 shows that one click generates around 3-4 kilo Bytes (kB) of transmission data. In other words, the mouse tracking application generates around 3-4 kB whenever one event occurs. The size depends on the metadata; in this case, the size greatly increases when the date and URL are included because they contain many characters.

Table 2.2 The data generated of one click posted to the server © (Purnama et al., 2020b). The rows before the last row are the types of information, and the last row shows the data rate of the submitted post (Purnama et al., 2020a).
Variables: NO, ID, Name, Email, X, Y, leftclick, rightclick, middleclick, keyboardtype, scrollx, scrolly, zoom, touch, touchmove, tab, duration, date, content url, windowsheight, windowswidth, screenheight, screenwidth
Data rate (kB): 3.11, 3.14, 3.14, 3.2, 3.2, 3.22, 3.25, 3.29, 3.43, 3.56, 3.64, 3.72

If the administrator can estimate the number of users and the average number of events generated per user, then the administrator can estimate the amount of data to be generated. Rheem, Verma, and Becker, 2018 state that a very high activity level is around 70 events per second. Based on Figure 2.9, the worst case scenario to expect is a user generating a data rate of 210-280 kilo Bytes per second (kBps).
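The estimate above is simple arithmetic: events per second multiplied by the measured kB per event from Table 2.2 gives the per-user data rate.

```javascript
// Back-of-the-envelope estimate of mouse tracking data rate per user:
// (events per second) x (kB per event) = kBps.
function dataRateKBps(eventsPerSecond, kBPerEvent) {
  return eventsPerSecond * kBPerEvent;
}

// Worst case activity from Rheem, Verma, and Becker (2018): ~70 events/s,
// with one event costing 3-4 kB (Table 2.2).
var low = dataRateKBps(70, 3);  // 210 kBps
var high = dataRateKBps(70, 4); // 280 kBps
```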

https://file.army/i/B4MmimE
Figure 2.9 A plot of data rate generated by a user based on the events generated per second © (Purnama et al., 2020b). The horizontal axis represents the events per second or frequency in Hertz (Hz) and the vertical axis represents the data rate in kilobytes per second. The different colored lines represent the number of variables included (refer to Table 2.2).

2.3 Overseas Online Mouse Tracking Implementation

2.3.1 Quiz Details

An online quiz session was conducted on the 3rd of January 2019 between approximately 12:00 and 14:30 Japan standard time. There were two sessions, each lasting approximately an hour, with 20 and 21 students respectively (41 students in total) from the School of Engineering and Applied Sciences, National University of Mongolia, accessing the Moodle server at the Human Interface and Cyber Communication Laboratory, Kumamoto University. The map illustration is shown in Figure 2.10.

https://file.army/i/B4MmjPU
Figure 2.10 Overseas real-time online mouse tracking implementation © (Purnama et al., 2020b). Forty-one clients from the National University of Mongolia, separated into two groups, accessed the Moodle server at Kumamoto University in turns through the Internet and participated in a ten-question quiz session while mouse and keyboard activities were recorded. The Moodle server also measured the resource costs.

The quiz is part of a mid-term exam of the Microprocessor and Interfacing Techniques course for sophomore and junior year students in the Department of Electronics and Communication Engineering, National University of Mongolia. The quiz is hosted on https://md.hicc.cs.kumamoto-u.ac.jp. Figure 2.11 shows a screenshot of the Moodle log file and Figure 2.12 shows a screenshot of the students' grades for the quiz session. The detailed anonymous log files are published in Mendeley Data (Purnama et al., 2020a). The internet protocol (IP) addresses of the students, for example "119.40.99.53", can be geo-located to Mongolia, while "https://md.hicc.cs.kumamoto-u.ac.jp", which resolves via nslookup to "133.95.104.1", originates from Japan.

https://file.army/i/B4Mo9Z3
Figure 2.11 Screenshot of the Moodle log of students from National University of Mongolia who attempted a quiz session on a Moodle server at Kumamoto University © (Purnama et al., 2020b).
https://file.army/i/B4MoBdZ
Figure 2.12 Screenshot of the Moodle grades of students from National University of Mongolia who attempted a quiz session on a Moodle server at Kumamoto University © (Purnama et al., 2020b).

2.3.2 Amount of Data Generated

A screenshot of the mouse tracking log can be seen in Figure 2.13. Based on the data shared in Mendeley (Purnama et al., 2020a), the majority of the events are mouse movements and scrolls. That is because each change that occurs in either the mouse cursor or the scroll position is captured. Rapid mouse movements or scrolls generate large amounts of data; how much depends on the capabilities of the computer. Theoretically, if the mouse cursor travels a distance of 1000 pixels, then 1000 mouse movement events are generated, and if the scroll distance from top to bottom is 1000 pixels, then 1000 scroll events are generated. In short, the capturing of geometrical data, which is the x and y coordinates of the mouse cursor and scroll, is the cause of the huge data generation. Also, the effect is multiplied by the number of labels attached, such as the identity of the user who performed the events and the place and time of the event occurrences. Just removing the URL label can save a lot of data space.

https://file.army/i/B4Mo7Nq
Figure 2.13 Screenshot of mouse tracking data of students from National University of Mongolia who attempted a quiz session on a Moodle server at Kumamoto University © (Purnama et al., 2020b).

During the quiz session, Figure 2.14 shows that a student is capable of generating a total of over 20000 events, which is over 80 Mega Bytes (MB) of transmission data. This means that the student had to upload 80 MB of data at the end of the mouse tracking session on each page. According to Ookla, 2020, the global average network speed is 9.3 MBps downlink and 3.9 MBps uplink, which means there exist countries with average network speeds below that. Although it is common nowadays for university size institutions to have network speeds over 100 MBps, those resources are usually already allocated for many things. For example, the author's laboratory was only given a 2 MBps network, meaning a mouse tracking session can flood the network. This explains why administrators are reluctant to implement online mouse tracking. Imagine how much data could be generated if online mouse tracking were implemented by the whole university daily and full time.

https://file.army/i/B4MoYKF
Figure 2.14 Total query/rows/events generated by each students during mouse tracking implementation between National University of Mongolia and Kumamoto University and its estimated total data transmission size © (Purnama et al., 2020b). The horizontal axis represents individual students, primary vertical axis is the query/rows/events, and secondary vertical axis is the estimated data transmission size.

The amount of mouse tracking data is almost incomparable to page views and other conventional web analytics. Table 2.3 shows that the Moodle log and grades of the quiz session were only a few kilobytes, while the mouse tracking log is already over a hundred megabytes. The table also shows other logs that required a long duration and many users to reach the amount of data that the mouse tracking log has. While a few hard drives are enough to store conventional web and educational logs, many more are needed to store mouse tracking logs.

Table 2.3 Comparison of mouse tracking data size to daily pageview, Moodle log and grades, NASA server log 1995, Open University learning analytics dataset, and HarvardX Person-Course 2013 © (Purnama et al., 2020b).
Log File Duration Students Size
Daily Pageview City Archive 2 Month - 13 kB
Moodle Log and Grades 3h 30min 41 191 kB
Mouse Tracking 3h 30min 41 122 MB
NASA Server Log 1995 23 days - 153 MB
Open University Learning Analytics 1 Year 32593 442 MB
HarvardX Person-Course 2013 1 Year 301609 33.8 MB

3 Online Mouse Tracking Resource Saving Methods

Unfortunately, the resource usage of online mouse tracking is too high for regular people to implement daily and full time, except on special occasions such as examinations. The ones who can implement online mouse tracking daily and full time are big institutions such as Amazon and Google. Therefore, this chapter discusses this thesis's novel method for reducing the resource usage of online mouse tracking.

3.1 Existing Methods

Existing methods to reduce mouse tracking data transmission are common sense and popular methods, most of which were discussed by Purnama et al., 2020b. They are:

  • Redundant data reduction, which is mostly about reducing metadata, such as shortening the date format, shortening the URL, avoiding duplicate or repetitive data, and excluding information deemed unnecessary.
  • Sampling rate reduction, which adds a delay to the event capturing. The default is to capture immediately, such as every time the mouse cursor or scroll moves even by one pixel point; with sampling rate reduction, there are pauses in the capturing process, for example every 50 milliseconds, 1 second, 2 seconds, etc., where the longer the interval, the greater the data reduction, but at the cost of data resolution.
  • Adaptive sampling, where the application does not capture while the mouse cursor and scroll are idle, unlike usual eye tracking, where the eye gaze is captured at a fixed interval even when the gaze's position does not change.
  • Compression methods, which were researched by Leiva and Huang, 2015 and Martín-Albo et al., 2016.
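Sampling rate reduction and adaptive sampling can be sketched together as a small predicate that decides whether a capture is accepted. This is an illustrative sketch, not the thesis application's code: a capture is accepted only when at least `intervalMs` has passed since the last accepted capture (rate reduction) and the position actually changed (adaptive sampling).

```javascript
// Returns a shouldCapture(timeMs, x, y) predicate combining sampling rate
// reduction with adaptive (skip-when-idle) sampling.
function makeSampler(intervalMs) {
  var lastTime = -Infinity, lastX = null, lastY = null;
  return function shouldCapture(timeMs, x, y) {
    if (timeMs - lastTime < intervalMs) return false; // rate reduction: too soon
    if (x === lastX && y === lastY) return false;     // adaptive: position unchanged
    lastTime = timeMs; lastX = x; lastY = y;
    return true;
  };
}
```

A longer interval gives more data reduction at the cost of resolution, exactly the trade-off described above.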

3.2 Real-Time Online Mouse Tracking

The conventional data transmission method is to transmit the data as a single package at the end of each mouse tracking session. Based on Figure 2.14, this conventional transmission method floods the 2 MBps network. The author anticipated this and implemented a real-time transmission method (Purnama et al., 2020b), avoiding the frequent flooding of the 2 MBps network and reducing the data rate to an average of around 100 kBps. Although the average data rate is 100 kBps, Figure 3.1 shows many spikes where the difference between the average and the maximum is large, which indicates that there were moments of high activity. The highest spike is around 800 kBps. The spikes point not only upward but downward as well, which indicates that there were also moments of low activity. Overall, the standard deviation is high: there were times of high activity and times of low activity, so precise data usage can be difficult to predict.

https://file.army/i/B4MozXe
Figure 3.1 Data rate during mouse tracking implementation between National University of Mongolia and Kumamoto University. The horizontal axis represents 10 minute interval time and the vertical axis represents the data rate in kilobytes per second. The yellow horizontal line shows the average and the vertical lines shows the minimum and maximum during their respective interval © (Purnama et al., 2020b).

The differences between offline mouse tracking, online mouse tracking, and real-time online mouse tracking are described in Figure 3.2. While offline mouse tracking stores the data on each of the users' computers, online mouse tracking transmits the data to the server. While conventional online mouse tracking stacks the data until the end of every session before transmitting it as a single package, real-time online mouse tracking transmits the data immediately every time an event occurs. Real-time online mouse tracking helps reduce the probability of a bottleneck, as illustrated in Figure 3.3. This helps to balance the transmission load.

https://file.army/i/B4Mo4b5
Figure 3.2 Flowchart of mouse tracking © (Purnama et al., 2020b): offline (left), online (middle), real-time and online (right).
https://file.army/i/B4MoK7A
Figure 3.3 Illustration of bottleneck network in regular online mouse tracking and real-time online mouse tracking as a solution © (Purnama et al., 2020b).
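The contrast between conventional (batched) and real-time transmission can be sketched as two small trackers. The `send` callback stands in for the HTTP POST to the server and is injected so the logic is self-contained; the names are illustrative.

```javascript
// Conventional online mouse tracking: stack events until the session ends,
// then transmit everything as a single package.
function makeBatchTracker(send) {
  var buffer = [];
  return {
    record: function (e) { buffer.push(e); },
    endSession: function () { send(buffer); buffer = []; }
  };
}

// Real-time online mouse tracking: transmit each event immediately,
// spreading the load over the whole session.
function makeRealtimeTracker(send) {
  return {
    record: function (e) { send([e]); }
  };
}
```

The batch tracker produces one large transmission at session end (the bottleneck of Figure 3.3), while the real-time tracker produces many small ones.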

3.3 Lossy Online Mouse Tracking

3.3.1 Three Mouse Tracking Preprocessing and Transmission Methods

At the end of Chapter 2, it was established that the capturing of geometrical data, which is the x and y coordinates of the occurred events, and the time stamping of each event are the largest contributors to the data size. If the geometrical data can be reduced, then the data size can be reduced as well. Based on analysis of many example mouse tracking data sets, there are three possible cases, illustrated in Figure 3.4:

https://file.army/i/B4MoRG4
Figure 3.4 Whole page vs region of interest vs default mouse tracking illustration. The left scroll illustrates summarized event amount that summarizes the number of events occurring on the whole page; the middle scroll illustrates ROI tracking that summarizes the number of events occurring in defined areas, and the right scroll illustrates default mouse tracking that records every event and the precise point where it occurs, forming a trajectory.
  1. Default mouse tracking, which uses all of the geometrical data: when and where every event occurred, at each coordinate. An example of a data visualization that can be generated by default mouse tracking is the mouse trajectory, and if the time is recorded as well, a video replay of the mouse trajectory can be generated.
  2. Summarized event amount, which does not use any geometrical data: only the event amounts are captured, without knowing when and where they occurred. Currently, only the amounts of duration, mouse clicks, mouse movements, mouse scrolls, zooms, and keyboard types of each session are captured, sacrificing the position and time information of the occurred events.
  3. Region of interest (ROI) mouse tracking, which uses only selected geometrical data, where the coordinates are summarized into selected areas. In other words, the mouse tracking is no longer able to identify the coordinates but only obtains an activity heatmap of the areas. Currently, the amounts of duration, mouse clicks, mouse movements, mouse scrolls, zooms, and keyboard types of each session are captured for the header, footer, navigation menu, and each quiz question area, sacrificing the exact coordinate information of each event. This method is a continuation of previous work by Purnama et al., 2016 and Purnama, Fungai, and Usagawa, 2016.

By knowing which geometrical data the analyzers want, the storage and transmission cost can be reduced by applying preprocessing and modifying the transmission method, based on Figure 3.5. The default is real-time online mouse tracking, where the event information is sent to the server immediately at the moment it occurs. For the summarized event amount, only the amounts of events are recorded, excluding the place and time of occurrence. It is discouraged to update the event amount in real time because that costs network data. Instead, it is best to utilize the conventional transmission method, where the final event amount value is sent only once at the end of each session (refer to Figure 3.2, online mouse tracking transmission not in real time). Unfortunately, there are still some potential problems with this conventional transmission method: if the user ends the session in haste, there may not be enough time to retrieve the mouse tracking data from the client to the server, potentially losing the data. For ROI mouse tracking, the amounts of events are accumulated while the mouse cursor is still within a specific area. When the mouse cursor moves to a new area, the event amount information of the previous area is sent to the server, and the process repeats. There is still a limit in determining and labelling web page areas. Usually, it is done manually by the analyzers, but this is very labor- and time-consuming. It is possible to determine and label areas automatically using element offsets in the DOM, but not in a smart way, as it depends on the layout of the web page. After the areas are determined for ROI mouse tracking, the transmission method is a hybrid of conventional and real-time: the mouse cursor enters an area and the event amounts accumulate, the result is transmitted after the mouse cursor leaves the area, and the process repeats upon entering a new area.

https://file.army/i/B4Moemn
Figure 3.5 Three Types of Mouse Tracking Flowchart. The left flowchart is default mouse tracking, the middle flowchart is summarized event amount, and the right flowchart is region of interest mouse tracking (Purnama and Usagawa, 2020).
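The hybrid ROI transmission described above can be sketched as an accumulator: event amounts build up while the cursor stays in one area, the totals are flushed when the cursor enters a new area, and a final flush happens at session end. The `send` callback and area names are illustrative assumptions.

```javascript
// ROI mouse tracking sketch: accumulate event counts per area and
// transmit an area's totals when the cursor leaves it.
function makeRoiTracker(send) {
  var currentArea = null, counts = null;
  function flush() {
    if (currentArea !== null) send({ area: currentArea, counts: counts });
  }
  return {
    record: function (area, eventType) {
      if (area !== currentArea) { // cursor moved to a new area
        flush();                  // hybrid: transmit on area change
        currentArea = area;
        counts = {};
      }
      counts[eventType] = (counts[eventType] || 0) + 1;
    },
    endSession: function () { flush(); } // conventional: final flush at session end
  };
}
```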

3.3.2 Three Mouse Tracking Preprocessing and Transmission Simulations

Since the author did not have another mouse tracking experiment opportunity, a simulation was conducted based on the previous mouse tracking experiment, shown in Figure 3.6. It is possible to replay the scenario because the date of each event during the mouse tracking session was captured. However, there was a limitation: half of the students' logs used a different time zone format that was difficult to simulate, so those students were excluded, leaving only 23 students.

https://file.army/i/B4MoxU1
Figure 3.6 In Purnama and Usagawa, 2020, the simulation is based on Figure 2.10. In this thesis, the server is changed to a Raspberry Pi 3 single board computer. The reason is to support regions with limited connectivity, as illustrated in Figure 3.7.

Additionally, in this thesis, the author simulated the mouse tracking on a Raspberry Pi 3 single board computer in consideration of those in regions with limited connectivity, where the mouse tracking quiz session is run locally as illustrated in Figure 3.7. It is also interesting to see how well the Raspberry Pi 3 can handle the mouse tracking simulation in terms of CPU and RAM.

https://file.army/i/B4MoSu7
Figure 3.7 Even though the ownership of computer and mobile devices increase drastically, the pace of Internet penetration may not be as fast. Those who are in limited connectivity region may not be able to enjoy online quizzes, let alone mouse tracking. Therefore Purnama et al., 2017 offers a hand carry server solution where the students' computer devices can connect to the teachers' single board computer server that runs quiz and mouse and touch tracking.

Five mouse tracking simulations were performed on a quiz page with a dimension of 1920x1080 pixels:

  1. Default mouse tracking simulation without changes in the original mouse tracking data.
  2. ROI mouse tracking, where the coordinates are summarized into certain areas for each user. The summarizing follows the flow of the time domain: a query based on the summarized coordinates is generated every time a user leaves an area, rather than as a total summary of each area; more information can be found in Appendix A:
    • ROI mouse tracking 1, where the coordinates are summarized into 50 areas consisting of the header, title, quiz navigation, navigation, administration, footer, each quiz flag, each quiz question, each quiz answer, and blank areas.
    • ROI mouse tracking 2, where the coordinates are summarized into 35 areas, with each quiz question and its answers combined.
    • ROI mouse tracking 3, where the coordinates are summarized into 20 areas, with each quiz flag combined into its respective quiz area.
  3. Summarized event amount mouse tracking simulation, where the data is transformed by summarizing the event amounts of each user into a single query, and the queries are sent based on the end session time of each user.

3.3.3 Three Mouse Tracking Preprocessing and Transmission Results

The result is that a great reduction in data size is achieved by sacrificing some geometrical data for ROI mouse tracking and all geometrical data for the summarized event amount, as shown in Table 3.1. Surprisingly, on the user side, the total script execution time in the browser was also reduced, as shown in Figure 3.8. The transmission cost was also reduced, as shown by the reduced data rate in Figure 3.9, which also parallels the server's CPU and RAM usage.

Table 3.1 Comparison of data amount generated from the three types of mouse tracking.
Type Queries Data Size
Default Mouse Tracking 286510 ∼100 MB
ROI Mouse Tracking 1 28048 ∼7.7 MB
ROI Mouse Tracking 2 19061 ∼5.3 MB
ROI Mouse Tracking 3 17880 ∼4.9 MB
Summarized Event Amount 23 ∼16 kB
https://file.army/i/B4MoZds
Figure 3.8 The total script running time of three mouse tracking demo sessions by the author. The horizontal axis is the mouse tracking method. The data in order are from Mozilla Firefox, Microsoft Edge, and Google Chrome. The vertical axis is the total running time in milliseconds. Among the three browsers, Mozilla Firefox performed faster than Microsoft Edge, and Microsoft Edge performed faster than Google Chrome for this work (Purnama and Usagawa, 2020).

The Raspberry Pi's CPU is not strong enough to handle the default mouse tracking simulation of around 20 users, where the CPU often reached 100% usage. Even the RAM usage was abnormally high, over hundreds of MB. However, it was able to handle the ROI mouse tracking and summarized event amount methods. This shows how useful the data reduction methods are.

https://file.army/i/B4MouVl https://file.army/i/B4Mo6Kj https://file.army/i/B4MoEXk
Figure 3.9 CPU and RAM usage and data rate comparison between default mouse tracking, summarized event amount, and ROI mouse tracking.

Among the three mouse tracking methods, the summarized event amount method achieves the maximum resource reduction because all the geometrical and time data are excluded, or, put differently, the whole page consists of only one area. Theoretically, the number of queries is reduced to one per mouse tracking session. ROI mouse tracking does not necessarily always result in a resource reduction as large as the one in this thesis. Theoretically, it depends on the area division of the web page. The fewer the divisions, the larger each area and the larger the resource reduction, and vice versa. With more divisions, the areas become smaller and the resource usage becomes larger; eventually, if the areas keep being divided, each area becomes as small as a coordinate, which is the same as default mouse tracking.

3.3.4 Synchronization for Hand Carry Server Quiz

The teacher may decide to conduct the quiz locally using a hand carry server, illustrated in Figure 3.7, for various limited connectivity reasons such as an expensive or unstable Internet connection. If the log data is only for the teacher to use, then all is well, but if it is for institutional use, the teacher may have to synchronize the data to the institution's server. It is wise to use the incremental synchronization method illustrated in Figure 3.10 to reduce data, especially for large data like mouse tracking logs.

https://file.army/i/B4MoOnv
Figure 3.10 Suppose there are two quiz sessions like the one in this thesis. The teacher has to synchronize the data two times: after the first session and after the second session. Although the human mind knows that it is better to update only the difference, computers today still do not operate that way. Even the default copy operation on most desktops still functions by copying the whole data and replacing the old, shown on the left. Today, a separate application must be used to perform incremental synchronization, shown on the right, that is able to calculate the difference between the old and new data © (Purnama, 2017).

There are two ways to perform incremental synchronization. The first one is to store the data in Structured Query Language (SQL) form, which is mostly used in database applications. SQL stores the data in the form of tables, and updating just requires sending new rows from the teacher's database to the institution's database. Most log data grow in a unidirectional incremental/addition fashion, which is why SQL is mostly used. However, if the update is more than just incremental, such as a correction involving deletion and modification, then it is more complicated for SQL to handle (Purnama, Usagawa, Ijtihadie, et al., 2016). The most popular algorithm to handle this kind of update is the rsync algorithm, illustrated in Figure 3.11. An example use case is when the teacher forgets to exclude private data when privacy is a concern and accidentally uploads it to the server. In this case, the teacher would want to remove the private data in each query, where rsync can save resource costs. Though this is less likely to occur, a more realistic case is a teacher needing to update their quiz contents from the server, where the update consists of additions, deletions, and relocations.
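The SQL-based incremental synchronization above can be sketched with Python's built-in sqlite3 module. The table name, schema, and append-only id column are illustrative assumptions, not taken from the thesis code: only rows the remote side has not yet seen are transferred.

```python
import sqlite3

def incremental_sync(local, remote, table="mouse_log"):
    """Push only rows the remote has not seen yet (append-only logs).

    Both connections hold a table (id INTEGER PRIMARY KEY, data TEXT).
    The highest id on the remote marks the last synchronized row.
    """
    last = remote.execute(f"SELECT COALESCE(MAX(id), 0) FROM {table}").fetchone()[0]
    rows = local.execute(f"SELECT id, data FROM {table} WHERE id > ?", (last,)).fetchall()
    remote.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    remote.commit()
    return len(rows)  # number of rows actually transferred

local = sqlite3.connect(":memory:")
remote = sqlite3.connect(":memory:")
for db in (local, remote):
    db.execute("CREATE TABLE mouse_log (id INTEGER PRIMARY KEY, data TEXT)")
local.executemany("INSERT INTO mouse_log VALUES (?, ?)",
                  [(1, "click"), (2, "move"), (3, "scroll")])
remote.execute("INSERT INTO mouse_log VALUES (1, 'click')")  # already synced earlier
print(incremental_sync(local, remote))  # 2
```

Only the two new rows travel over the network, not the whole three-row log; this is exactly the unidirectional incremental case that SQL handles well.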

https://file.army/i/B4MoG7H https://file.army/i/B4MoLGf https://file.army/i/B4MohoI https://file.army/i/B4Mo1UV
Figure 3.11 A detailed illustration of the rsync algorithm procedure, where the steps in summary are: split the data into blocks, scan for block relocations, and scan for blocks that do not exist, which can be newly added blocks or unused blocks to be deleted. Finally, execute the relocations, additions, and deletions based on the information obtained from the scanning (Purnama, 2017).
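The block-matching idea the figure summarizes can be sketched as a toy delta encoder. This is a simplifying assumption-laden sketch, not real rsync: it uses a tiny fixed block size and re-hashes at every offset instead of using the rolling weak checksum that makes real rsync efficient.

```python
import hashlib

BLOCK = 4  # toy block size; real rsync uses much larger blocks

def delta(old: bytes, new: bytes):
    """Toy rsync-style delta: reuse blocks of `old` found anywhere in `new`."""
    # Index the old data's non-overlapping blocks by their hash.
    index = {hashlib.md5(old[i:i + BLOCK]).digest(): i
             for i in range(0, len(old) - BLOCK + 1, BLOCK)}
    ops, i = [], 0
    while i < len(new):
        h = hashlib.md5(new[i:i + BLOCK]).digest()
        if len(new) - i >= BLOCK and h in index:
            ops.append(("copy", index[h]))        # block relocated from old data
            i += BLOCK
        else:
            ops.append(("literal", new[i:i + 1]))  # newly added byte
            i += 1
    return ops

def patch(old: bytes, ops):
    """Rebuild the new data from the old data plus the delta operations."""
    return b"".join(old[j:j + BLOCK] if op == "copy" else j for op, j in ops)

old, new = b"abcdefgh", b"XXabcdefgh"  # two bytes inserted at the front
ops = delta(old, new)
print(patch(old, ops) == new)  # True
```

Only the two literal bytes and two short copy instructions are transmitted instead of the whole new file, which is the resource saving the thesis refers to.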

4 The Depth Levels of Logs

Back in Chapter 1, it was emphasized that conventional web logs and educational data have a limitation regarding the information that they can derive. Mostly, it was about how those conventional logs could not capture the users' or students' behavior online. Eye and mouse tracking solve that problem by capturing how the students interact. It took some time for the author to understand and conceptualize the meaning behind those repeating statements about what conventional log data cannot tell while eye and mouse tracking logs can. It turns out that the depth levels of those logs are different, where eye and mouse tracking logs belong to a deeper level than conventional logs.

https://file.army/i/B4MoQup
Figure 4.1 Six levels of web logs, in order from shallowest to deepest: Internet, websites, categories, web pages, areas, and coordinates.

This thesis defines six depth levels of web logs from the browser content point of view, shown in Figure 4.1. Most analyzers do not know that there are deeper levels of logs, and most tools do not generate data deeper than web page level logs. The web log depth levels converted to educational data are illustrated in Figure 4.2. Most educational tools only generate logs up to the course content level, which are mostly how many times the students attempt the activity and what grade they received. This chapter discusses the three deepest log levels and explains how mouse tracking belongs to the deepest log level.

https://file.army/i/B4MoAVD
Figure 4.2 Six levels of educational data, in order from shallowest to deepest: Internet, academies, courses, course contents, areas, and coordinates.

4.1 Web page / Course Content Level Logs

4.1.1 Conventional Web Logs and Educational Data

Conventional web logs belong to at most the web page level. They are mainly page views, which show that a web page from a certain website and category has been viewed (Bluehost, 2016). Additional metadata can be attached to the page view:

  • "Who", the identity of the viewer, can be identified if the viewer registers to the website or provides identity on the browser and gives permission to identify; if not, then the internet protocol (IP) address of the viewer can be captured.
  • "Where" can be the link of the web page, or the locations of the web server and viewer if they are identifiable.
  • "When" is usually the date and time of the occurring page view or any action. More specifically, the duration can be calculated.
  • "What" is usually the action of the viewer labeled by the analyzer. If the web page is a reading content then the viewer's action is labeled as reading. If it is an audio content then the viewer's action is labeled as listening. If it is a video content then the viewer's action is labeled as watching. If it is a forum then the viewer's action is labeled as discussing, etc.
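The four metadata fields above can be sketched as a simple log record. The field names, content types, and action labels here are illustrative, not a real web server log format:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class PageView:
    """One conventional page-view entry: the who/where/when/what metadata."""
    who: str        # registered user id, or the viewer's IP address
    where: str      # link of the viewed web page
    when: datetime  # date and time of the page view
    what: str       # action label inferred from the content type

# Hypothetical mapping from content type to the analyzer's action label.
CONTENT_ACTIONS = {"text": "reading", "audio": "listening",
                   "video": "watching", "forum": "discussing"}

def label_action(content_type: str) -> str:
    return CONTENT_ACTIONS.get(content_type, "viewing")

view = PageView(who="192.0.2.10",
                where="https://example.org/course/quiz1",
                when=datetime(2019, 1, 3, 12, 0, tzinfo=timezone.utc),
                what=label_action("text"))
print(view.what)  # reading
```

Note that nothing in this record can say *how* the page was read, which is exactly the limitation discussed next.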

As page view belongs to at most the third deepest log level, there is a limit to how much it can tell, no matter how hard it is analyzed. For example, a page view cannot tell how a user is reading a content, such as whether the user is skimming or reading in detail. The limit is that a page view cannot capture activities that occur in specific areas of the web page. In education, there are four popular logs used by teachers: the materials the students read, the assignments submitted, the topics discussed in forums, and the quiz or exam grades. Unfortunately, just as with conventional web logs, conventional educational data can only tell what activities the students are doing and their duration, but cannot tell how the students attempt those activities, as emphasized in Figure 4.3. In other words, it can identify a certain extent of what, when, where, and who, but cannot identify deeper how the viewer interacts with the contents (Purnama et al., 2016) (Purnama, Fungai, and Usagawa, 2016).

https://file.army/i/B4Mo0AY
Figure 4.3 The top half of the image shows page view implemented in Moodle, which is called course view. It can tell what kind of activity is attempted based on the page label within the course, when in dates, and who by the students' registered names and IP addresses. The bottom half of the image shows eye tracking that can tell how a user is reading the text, which the Moodle log cannot tell.

4.1.2 Amount of Interactions

Although the summarized event amount of mouse tracking is on the depth level of web pages or course contents, it is still not widely known by analyzers. DOM events can tell many other interactions users perform on the web page. The simplest of them is knowing how much interaction the user does, such as how many clicks, touches, mouse movements, scrolls, zooms in and out, copies and pastes, keyboard presses, etc. Table 4.1 shows that the Mongolian students attempting the quiz session took on average 1368 seconds, and performed on average 175 left clicks, 8 middle clicks, 11004 mouse movements, and 4158 scrolls.

Table 4.1 The duration and event amount generated by 41 Mongolian students during a quiz session.
name duration leftclick rightclick middleclick mousemove scroll inactive highlight grade
student1 2572.66 50 0 15 13359 8747 1493 0 72
student2 2188.08 157 5 0 21066 6760 2662 2 32
student3 1659.22 278 1 6 13216 3895 1725 3 60
student4 2236.42 323 0 0 18068 7036 1467 4 84
student5 1916.00 346 1 14 17235 6019 1646 1 96
student6 2345.90 185 0 0 11006 5448 964 0 64
student7 1932.57 422 0 15 13748 2761 1735 0 60
student8 2748.21 173 2 0 12697 6151 1486 2 44
student9 1699.58 317 0 0 14462 4452 1848 0 72
student10 792.32 27 0 41 8436 4578 1125 0 64
student11 1021.46 241 0 0 11907 2018 1629 3 88
student12 691.06 64 0 0 7970 1995 1217 1 -
student13 889.88 610 0 19 11636 3754 1449 0 76
student14 1947.62 342 0 0 20724 4774 3235 0 72
student15 2300.24 37 0 0 15686 7346 2219 0 64
student16 1755.30 385 0 0 16435 5595 2385 3 64
student17 1770.57 29 0 0 16264 9521 1808 0 64
student18 2499.55 117 0 0 8686 2855 1685 1 92
student19 945.06 935 0 0 6491 5085 1199 0 76
student20 823.93 16 0 12 11590 4564 1604 0 60
student21 1624.92 23 0 10 10261 5123 1257 0 72
student22 1314.81 50 0 3 10966 2424 1700 1 96
student23 555.53 16 0 0 7726 1512 1443 0 84
student24 645.91 209 0 0 8814 2122 731 1 56
student25 1040.37 21 0 1 6351 2527 789 0 88
student26 1374.72 30 0 0 7340 2698 950 0 96
student27 837.26 21 0 0 6403 1849 984 0 56
student28 1902.36 173 0 7 15383 4817 2468 3 100
student29 992.71 31 0 0 10144 4335 1514 0 80
student30 859.91 94 0 0 10736 2417 1558 0 64
student31 1208.75 29 0 0 10430 4484 1197 0 76
student32 1030.32 16 0 0 11347 4227 1487 0 68
student33 1603.01 23 0 0 11425 3467 1729 0 60
student34 1098.69 23 0 0 8922 5386 806 0 84
student35 1462.85 27 0 0 7916 6822 1987 0 88
student36 2091.09 1179 3 179 16976 6360 4396 0 76
student37 827.12 26 0 0 6428 3062 719 0 96
student38 74.19 9 0 5 1068 328 156 0 80
student39 682.23 143 0 0 10061 2914 1419 0 68
student40 41.26 3 0 0 633 116 132 0 88
student41 90.92 7 0 0 1182 139 210 0 100
average 1368.16 175.78 0.29 7.98 11004.73 4158.12 1517.39 0.61 74.50
total 56094.53 7207.00 12.00 327.00 451194.00 170483.00 62213.00 25.00 2980.00
minimum 41.26 3.00 0.00 0.00 633.00 116.00 132.00 0.00 32.00
maximum 2748.21 1179.00 3.00 179.00 20724.00 9521.00 4396.00 4.00 100.00
stdev 728.82 295.33 0.72 39.05 4977.88 2291.82 914.11 1.16 15.97

Knowing the amount of DOM event occurrences on a web page may give a hint whether the web page fulfills its purpose or not. For example, a web page designed based on game theory is bound to be interactive, where fewer events such as clicks and movements may show that the users do not engage with the web page, whereas if the web page is designed for reading and there are many events, then there must be something wrong. The author expected a high amount of DOM events from the students because they were attempting a quiz, where they need to perform many clicks to choose answers and many movements to read the questions carefully and maybe review some questions. If there is no problem with the web page, then there can be problems with the users. A study by Rodrigues et al., 2013 showed that a high amount of events generated by a user can indicate that the user is stressed. Theoretically, there should be a common sense of how many events a user should generate within a certain duration.
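One candidate for such a common-sense measure is events generated per minute of session time. This metric is a hypothetical illustration, not one defined in the thesis; the student5 numbers are taken from Table 4.1.

```python
def event_rate(duration_s, **event_counts):
    """Events generated per minute of session time.

    duration_s:   session duration in seconds.
    event_counts: named event tallies (clicks, moves, scrolls, ...).
    """
    total = sum(event_counts.values())
    return total / (duration_s / 60)

# student5 from Table 4.1: 1916 s, 346 left clicks, 17235 moves, 6019 scrolls
rate = event_rate(1916.00, leftclick=346, mousemove=17235, scroll=6019)
print(round(rate, 1))  # 739.0
```

Comparing each student's rate against the class distribution would flag unusually high rates, the kind of signal Rodrigues et al. associate with stress.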

4.1.3 Web Page or Course Content Inactivity

Web page or course content inactivity is another DOM mouse event feature that analyzers do not know about. In page view, the duration can be counted for a visited web page, but it cannot tell whether the users are actually on the web page the whole time, because they can just open another tab and leave the previous one open. With mouse DOM events, it is possible to distinguish the amount of active and inactive time of users within a web page. Inactivity is indicated when the mouse cursor leaves the web page to open another tab or do other activities; when the mouse cursor re-enters the web page, the status shows active again.

In Table 4.1, the amount of inactivity queries of each student is provided, and in Figure 4.4, the amount of inactivity in the time domain is plotted. They show that the students do not always stay on the quiz page, which opens the possibility that they are seeking information from outside sources to answer the quiz better, such as searching for answers in search engines and messaging friends online. The amount of inactivities could be exaggerated due to system limitations, such as slow mouse leaves generating more inactivity queries than fast mouse leaves. However, the system design still ensures that no inactivity queries will be generated if the mouse does not leave the quiz area.
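Turning leave/enter events into an inactive-time total can be sketched as follows. The event representation is an assumption for illustration; in the browser these would correspond to mouseleave and mouseenter DOM events.

```python
def inactive_time(events, session_end):
    """Total seconds the cursor was outside the page.

    events:      chronological list of ("leave" | "enter", timestamp_seconds).
    session_end: timestamp when the session closed; a trailing "leave"
                 counts as inactive until then.
    """
    inactive, left_at = 0.0, None
    for kind, t in events:
        if kind == "leave":
            left_at = t
        elif kind == "enter" and left_at is not None:
            inactive += t - left_at
            left_at = None
    if left_at is not None:          # never came back before session end
        inactive += session_end - left_at
    return inactive

events = [("leave", 100), ("enter", 160), ("leave", 300), ("enter", 330)]
print(inactive_time(events, 400))  # 90.0
```

The two absences (60 s and 30 s) sum to 90 seconds of inactivity out of the session, the same quantity plotted per interval in Figure 4.4.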

https://file.army/i/B4MopRQ
Figure 4.4 Inactive queries plotted in time domain. The horizontal axis is the time interval in minutes and the vertical axis is the amount of inactive queries.

Aside from capturing inactivities, capturing highlight, copy, cut, and paste events can help in detecting dishonest behaviors. An alarm system can be developed to inform the examiners when such events occur. For important exams such as certifications, stricter systems can be implemented, such as immediately failing the test when the mouse cursor leaves the exam, illustrated in Figure 4.5.

https://file.army/i/B4MoFna
Figure 4.5 An exam detector that tracks unwanted activities of participants, such as the mouse leaving the exam, the tab and meta buttons being pressed to leave the exam, and other events indicating exam leaving.

4.2 Area Level Logs

Area level logs are logs showing activities within areas of the web page or course contents. This can be done by any one or a combination of capturing the mouse cursor position, the touch location, the scroll bar position, or tracking the eyeball position, then capturing the date and time of the events that occur at those positions. The ROI mouse tracking provides these kinds of information. The amount of activity in each area for this thesis is based on the total amount of events.

The most popular analysis of area level logs is heatmap visualization. There are many indications that can be derived from heatmaps. For example, high activity or duration in an area may indicate that users are interested in the area. If not, then they may have trouble with the area, whether trouble in understanding the content or questions that are too difficult; for example, in Figure 4.6, question three receives the most attention, which may indicate difficulty. There may also be design problems that cause unnecessary effort for users to capture the information. On the other hand, if an area has low activity or duration, it may indicate that the users are not interested, the design is not good enough to capture the users' attention, or the question in the quiz is simply too easy.

https://file.army/i/B4MofTL
Figure 4.6 DOM and mouse tracking of a whole class attempting a quiz session summarized into a heatmap. The color represents the duration of mouse cursors staying on an area, where short to long durations are indicated from green to red. The number inside an area is the total of clicks, movements, scrolls, and other events combined. The arrows indicate the amount of mouse cursors entering or leaving an area.
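Assigning heat colors to per-area activity counts can be sketched with the green/yellow/red scheme used in the following figures (green near the minimum, yellow near the second quartile, red near the maximum). The nearest-anchor rule and the area names are simplifying assumptions.

```python
import statistics

def heat_colors(area_counts):
    """Assign each area a coarse heat color by comparing its event count
    to the minimum, median (second quartile), and maximum over all areas."""
    values = list(area_counts.values())
    lo, mid, hi = min(values), statistics.median(values), max(values)

    def color(v):
        # pick the anchor (min, median, max) the value is closest to
        return min((("green", lo), ("yellow", mid), ("red", hi)),
                   key=lambda c: abs(v - c[1]))[0]

    return {area: color(v) for area, v in area_counts.items()}

counts = {"Quiz3 Question": 120, "Quiz6 Question": 15, "Quiz13 Question": 480}
print(heat_colors(counts))
# {'Quiz3 Question': 'yellow', 'Quiz6 Question': 'green', 'Quiz13 Question': 'red'}
```

Real heatmap tools interpolate a continuous color gradient; this three-bucket version only mirrors the legend described in the captions.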

Figure 4.7 shows an even more detailed heatmap where the visualization was split into 10 minute intervals. Just from a glance, it can be seen that the high activity times are the 30th, 90th, and 160th minutes, that the students took a break at the 130th minute, and that they finished at the 230th minute. Another interesting piece of information is that they did not bother much with the last question, maybe because it was too easy or because they just wanted to finish quickly, being too tired.

https://file.army/i/B4MowL9
Figure 4.7 Mouse activity heatmap of quiz page locations in time series. The horizontal axis represents 10 minute time intervals and the vertical axis represents quiz page locations. For the heatmap, green is close to the minimum activity, yellow is close to the second quartile, and red is close to the maximum activity.

Figure 4.8 shows another detailed heatmap regarding the amount of activity done by each student in each area. The heatmaps vary and do not show many similarities between students; however, there are some. A common correlation can be seen on question 13, where there is high activity, and looking at the grade/score distribution in Figure 4.9, many students got the answer wrong, which may be common evidence that the question is too difficult for them and that they had to put more effort into it. An opposite case is question 6, where there is low activity but many students got the answer wrong, which can lead the analyzer to wonder whether it is a trick question. Another case with strong similarity between Figure 4.8 and Figure 4.9 is that the students did very little activity on the last question, yet most of them got the answer wrong. Unlike question 6, it may not be a trick question but a difficult question, because the score allocation is high. There may be two possibilities: the first is that the students ran out of time, and since it is the last question, they answered randomly; the second is that the students were lazy and/or tired, so when they reached the last, difficult question, they answered randomly because they just wanted to finish the quiz quickly, giving up on the last question.

https://file.army/i/B4Mo3oo
Figure 4.8 Mouse activity heatmap of quiz page locations for each student. The horizontal axis represents quiz page locations and the vertical axis represents the anonymized students. For the heatmap, green is close to the minimum activity, yellow is close to the second quartile, and red is close to the maximum activity.

Those indications can be useful in many ways. For example, if the indications show that users are not paying attention to areas which content creators intended to emphasize, then the design needs fixing or the content needs revision. In education, the heatmap can be useful to profile the students. It can then be followed by a guidance system that automatically detects the students' interests, where the guidance system can guide the students in many ways such as linking to related resources, suggesting career paths, grouping them with relevant communities, etc. The profile can also be used in a stricter way, where the teacher gives students an assignment to read a passage and the system detects whether the students have paid enough attention to it or not.

https://file.army/i/B4MoWiE
Figure 4.9 Grades/marks/scores the students received on each question. The first row contains the labels of the anonymized students, the average score, and the amount of mistakes made on each question. The first column contains the labels of the question numbers along with the score allocated to each of them. Wrong answers are marked with 0 points and highlighted in red.

Additionally, there are some analyzers that count the amount of mouse entries into and exits from an area, which is known as the mouse flow. In quiz sessions, it is normal to find many mouse flows because students tend to review or revisit the questions, whether to double check or because they previously skipped them. On the other hand, for a website that is meant to guide or share information, many mouse flows may indicate problems, such as users being confused in finding the information they need and thus searching tirelessly (Hsu, Chang, and Liu, 2018).
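Counting mouse flows can be sketched from the ordered sequence of areas the cursor visits; the area labels are illustrative.

```python
from collections import Counter

def mouse_flows(area_sequence):
    """Count transitions (flows) between areas from the ordered sequence
    of areas the cursor visited; staying in the same area is not a flow."""
    flows = Counter()
    prev = None
    for area in area_sequence:
        if prev is not None and area != prev:
            flows[(prev, area)] += 1
        prev = area
    return flows

# Cursor path: lingers on Q1, jumps to Q2, revisits Q1, then moves on.
seq = ["Q1", "Q1", "Q2", "Q1", "Q2", "Q3"]
flows = mouse_flows(seq)
print(flows[("Q1", "Q2")])  # 2
```

A revisit such as Q2 back to Q1 shows up as its own flow, which is how the double-checking behavior described above becomes visible in the counts.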

A possible application is force reading, illustrated in Figure 4.10, for example making sure that students read the agreement to be tracked before an exam and that users read the terms of service. The administrator can configure variables such as the reading duration, the amount of activities, and the areas. Simply, if the user did not read an area enough, then the user cannot pass and must read enough of the defined passage.

https://file.army/i/B4MomuU
Figure 4.10 Illustration of force reading based on the duration the mouse cursor stays in an area. The left example shows that the mouse cursor did not stay long enough in each area, telling the user to read everything; the middle example shows that the mouse cursor did not stay long enough in the middle area, telling the user to complete reading the middle area; and the right example shows that the user's reading is satisfactory.
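The force-reading check in the figure can be sketched as a comparison of per-area dwell times against administrator-set thresholds. The area names and threshold values are hypothetical.

```python
def reading_complete(dwell_seconds, required_seconds):
    """Return the areas still needing reading; empty if reading satisfies.

    dwell_seconds:    area -> seconds the mouse cursor stayed in the area.
    required_seconds: area -> minimum dwell time the administrator set.
    """
    return [area for area, need in required_seconds.items()
            if dwell_seconds.get(area, 0) < need]

required = {"top": 30, "middle": 45, "bottom": 30}

print(reading_complete({"top": 40, "middle": 10, "bottom": 35}, required))
# ['middle']  -> tell the user to complete reading the middle area
print(reading_complete({"top": 40, "middle": 50, "bottom": 35}, required))
# []          -> reading satisfied, user may proceed
```

The three outcomes in Figure 4.10 correspond to this list containing every area, one area, or nothing.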

4.3 Coordinate Level Logs

The coordinate level is the deepest log level. The coordinate values can be based on the document, screen, or window perspective. This is the log that the default mouse tracking generates (Purnama et al., 2020b). It is overwhelming but contains the most information, and this is the log that most analyzers should want to keep. Shallower levels such as the area level log can be derived from the coordinate level log, and the derivation is unidirectional: the reverse is not possible (Purnama and Usagawa, 2020). The most popular analysis is to draw a mouse trajectory. If the times when the mouse cursor lands on the coordinates are recorded, then it is possible to replay what the users did.
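The replay possibility can be sketched as follows: given timestamped coordinates, sorting by time reconstructs the trajectory. The record layout is an assumption; the point is that the lossy area-level log, having discarded coordinates, can no longer support this.

```python
def replay(coordinate_log):
    """Reconstruct the cursor trajectory from a coordinate-level log.

    coordinate_log: list of (timestamp, x, y) records, possibly arriving
    out of order from real-time transmission; replay sorts them by time.
    """
    return [(x, y) for _, x, y in sorted(coordinate_log)]

# Records arrive out of order; sorting by timestamp restores the path.
log = [(2.0, 40, 40), (0.5, 10, 10), (1.2, 25, 30)]
print(replay(log))  # [(10, 10), (25, 30), (40, 40)]
```

Feeding this ordered path to a renderer one point at a time is what a session-replay visualization does.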

An example visualization that can be drawn from the mouse tracking data is the mouse click trajectory in Figure 4.11. It shows a user highlighting a text, which can indicate that the user is paying attention to that text, or attempting to copy it to save in the user's notes or to paste into a search engine to find more information about it. The amount of highlights the students did is also summarized in Table 4.1 and shows that the students who highlight get either high or low grades, not average grades. The speculation is that the questions they highlighted were too difficult for them, and they either succeeded in finding the answers on other sites or failed. Unfortunately, the copy and paste events were not implemented at that time. In fact, it was finding this highlighting behavior that motivated the author to add copy, paste, and other DOM events to the mouse tracking application.

https://file.army/i/B4Moof3
Figure 4.11 A visualization of clicks generated using the mouse tracking log of Mongolian students attempting the online quiz session. Left clicks are indicated by triangles, middle clicks by squares, and right clicks by pentagons. The two interesting parts of this visualization are the rapid left clicks on certain text areas that indicate highlighting, and the rapid middle clicks that indicate scrolling.

Although mouse tracking logs are part of the deepest level logs, there is still a limit to how much the mouse cursor and scroll positions can indicate, because certain events do not necessarily occur at those positions. For example, reading is based on the eye gaze, and typing may occur not far from the mouse and scroll position but not necessarily exactly at it. None of these logs alone makes the best log, but a combination of them does: combining conventional web logs or educational data with mouse tracking and eye tracking may provide a complete log.

5 Conclusion and Future Work

5.1 Conclusion

The author wrote an online mouse tracking application suitable for public implementation and implemented it during a quiz session at the Human Interface and Cyber Communication Laboratory, Kumamoto University, on the 3rd of January 2019 between approximately 12:00 and 14:30 Japan standard time. The amount of data generated by mouse tracking was investigated during the implementation, and the cause of the huge data generation was found to be the capturing of geometrical data, or coordinates, of each event. Aside from existing solutions to reduce data, this thesis also implemented and discussed a real-time transmission system for mouse tracking data retrieval that helps distribute the network's burden across the time domain. The main novelty of this thesis is the selectable geometrical online mouse tracking method, for the possible cases where not all the geometrical data are required. The method allows summarizing coordinates into areas or deleting the coordinates if they are not necessary. The results showed great reduction in storage and transmission costs. However, the method is lossy because the process is irreversible. Rich mouse tracking data were obtained, and in this thesis a new concept of log depth level was discussed, with example analyses that include click visualization and activity heatmaps which help in identifying the interaction between the students and the quiz page.

5.2 Future Work

The real-time transmission is not the best solution. A better method is to upgrade the real-time transmission method by integrating a smart transmission method, where the client can detect the traffic of the network and determine the optimal time for queuing and transmission. Although the selectable geometrical mouse tracking data method itself works, there are still problems with execution. If all of the geometrical data are excluded, the most efficient time to transmit the data is only once, when the user leaves the page. However, the problem lies with the browser, where there is currently no way to force the user to wait before the transmission process finishes, leaving a potential problem of data loss. The problem for ROI tracking is that it cannot perform smart area determination and labelling; normally, these are performed by humans. Therefore, one solution is to develop an artificial intelligence for this matter in the future. Finally, this doctoral thesis is only limited to mouse tracking with one type of activity, which is examination. There are various activities such as passage reading, e-commerce, entertainment, geo-visualization reading, search engines, social media, etc. which are open for future work.

Appendix A Data

A.1 Quiz Areas

Area x1, x2, y1, y2 Area x1, x2, y1, y2
Header 0, 1920, 0, 64 Quiz8 Question 529, 1900, 2453, 2493
Title 16, 1904, 150, 270 Quiz8 Answers 529, 1900, 2494, 2730
Quiz Navigation 18, 364, 291, 532 Quiz9 Flag 384, 528, 2731, 3242
Navigation 16, 366, 551, 1042 Quiz9 Question 529, 1900, 2731, 2831
Administration 18, 364, 1062, 1693 Quiz9 Answers 529, 1900, 2832, 3242
Quiz1 Flag 384, 528, 291, 570 Quiz10 Flag 384, 528, 3243, 3580
Quiz1 Question 529, 1900, 291, 341 Quiz10 Question 529, 1900, 3243, 3343
Quiz1 Answers 529, 1900, 342, 570 Quiz10 Answers 529, 1900, 3341, 3580
Quiz2 Flag 384, 528, 571, 852 Quiz11 Flag 384, 528, 3581, 3856
Quiz2 Question 529, 1900, 571, 621 Quiz11 Question 529, 1900, 3581, 3631
Quiz2 Answers 529, 1900, 622, 852 Quiz11 Answers 529, 1900, 3632, 3856
Quiz3 Flag 384, 528, 853, 1133 Quiz12 Flag 384, 528, 3857, 4169
Quiz3 Question 529, 1900, 853, 903 Quiz12 Question 529, 1900, 3857, 3907
Quiz3 Answers 529, 1900, 904, 1133 Quiz12 Answers 529, 1900, 3908, 4169
Quiz4 Flag 384, 528, 1134, 1441 Quiz13 Flag 84, 528, 4170, 4746
Quiz4 Question 529, 1900, 1134, 1184 Quiz13 Question 529, 1900, 4170, 4520
Quiz4 Answers 529, 1900, 1185, 1441 Quiz13 Answers 529, 1900, 4521, 4746
Quiz5 Flag 384, 528, 1442, 1748 Quiz14 Flag 384, 528, 4747, 5295
Quiz5 Question 529, 1900, 1442, 1492 Quiz14 Question 529, 1900, 4747, 5097
Quiz5 Answers 529, 1900, 1493, 1748 Quiz14 Answers 529, 1900, 5098, 5295
Quiz6 Flag 384, 528, 1749, 2027 Quiz15 Flag 384, 528, 5296, 5842
Quiz6 Question 529, 1900, 1749, 1799 Quiz15 Question 529, 1900, 5296, 5646
Quiz6 Answers 529, 1900, 1800, 2027 Quiz15 Answers 529, 1900, 5647, 5842
Quiz7 Flag 384, 528, 2028, 2452 Footer 0, 1920, 5939, 6116
Quiz8 Flag 384, 528, 2453, 2730 Blank Areas except listed here

A.2 Full Quiz Page Heatmap

https://file.army/i/B4MorgZ
Figure A.1 Visualization of mouse tracking data. Default mouse tracking data can visualize exact points of location: the left image is a click visualization and the middle image is a heatmap based on the duration the mouse cursor stays on each point. ROI tracking can only visualize defined areas and show flows between areas, shown in the right image.

Appendix B Copyrights

Below are the publications reused in this thesis that do not require copyright clearance:

Below are the publications reused in this thesis that require copyright clearance, which was obtained:

https://file.army/i/B4MoJRq https://s100.copyright.com/CustomerAdmin/PLF.jsp?ref=d6af378c-4d45-49da-90cf-5ce0e54d2473

SPRINGER NATURE LICENSE
TERMS AND CONDITIONS

Sep 09, 2020




This Agreement between Mr. Fajar Purnama ("You") and Springer Nature ("Springer Nature") consists of your license details and the terms and conditions provided by Springer Nature and Copyright Clearance Center.

License Number: 4852391261684
License date: Jun 19, 2020
Licensed Content Publisher: Springer Nature
Licensed Content Publication: Education and Information Technologies
Licensed Content Title: Implementation of real-time online mouse tracking on overseas quiz session
Licensed Content Author: Fajar Purnama et al
Licensed Content Date: Mar 6, 2020
Type of Use: Thesis/Dissertation
Requestor type: academic/university or research institute
Format: print and electronic
Portion: full article/chapter
Will you be translating?: no
Circulation/distribution: 50000 or greater
Author of this Springer Nature content: yes
Title: Development of a Lossy Online Mouse Tracking Method for Capturing User Interaction with Web Browser Content
Institution name: Kumamoto University
Expected presentation date: Jul 2020
Requestor Location: Mr. Fajar Purnama, JL. GN. AGUNG Gg. YAMUNA II, NO. 4, Denpasar, Bali 80119, Indonesia. Attn: Mr. Fajar Purnama
Total: 0.00 USD

Terms and Conditions

Springer Nature Customer Service Centre GmbH
Terms and Conditions

This agreement sets out the terms and conditions of the licence (the Licence) between you and Springer Nature Customer Service Centre GmbH (the Licensor). By clicking 'accept' and completing the transaction for the material (Licensed Material), you also confirm your acceptance of these terms and conditions.

  1. Grant of License

    1. The Licensor grants you a personal, non-exclusive, non-transferable, world-wide licence to reproduce the Licensed Material for the purpose specified in your order only. Licences are granted for the specific use requested in the order and for no other use, subject to the conditions below.

    2. The Licensor warrants that it has, to the best of its knowledge, the rights to license reuse of the Licensed Material. However, you should ensure that the material you are requesting is original to the Licensor and does not carry the copyright of another entity (as credited in the published version).

    3. If the credit line on any part of the material you have requested indicates that it was reprinted or adapted with permission from another source, then you should also seek permission from that source to reuse the material.

  2. Scope of Licence

    1. You may only use the Licensed Content in the manner and to the extent permitted by these Ts&Cs and any applicable laws.

    2. A separate licence may be required for any additional use of the Licensed Material, e.g. where a licence has been purchased for print only use, separate permission must be obtained for electronic re-use. Similarly, a licence is only valid in the language selected and does not apply for editions in other languages unless additional translation rights have been granted separately in the licence. Any content owned by third parties are expressly excluded from the licence.

    3. Similarly, rights for additional components such as custom editions and derivatives require additional permission and may be subject to an additional fee. Please apply to the Licensor's permissions contact for these rights.

    4. Where permission has been granted free of charge for material in print, permission may also be granted for any electronic version of that work, provided that the material is incidental to your work as a whole and that the electronic version is essentially equivalent to, or substitutes for, the print version.

    5. An alternative scope of licence may apply to signatories of the STM Permissions Guidelines, as amended from time to time.

  3. Duration of Licence

    1. A licence is valid from the date of purchase ('Licence Date') until the end of the relevant period in the table below:

        Scope of Licence      Duration of Licence
        ----------------      -------------------
        Post on a website     12 months
        Presentations         12 months
        Books and journals    Lifetime of the edition in the language purchased
  4. Acknowledgement

    1. The Licensor's permission must be acknowledged next to the Licensed Material in print. In electronic form, this acknowledgement must be visible at the same time as the figures/tables/illustrations or abstract, and must be hyperlinked to the journal/book's homepage. Our required acknowledgement format is in the Appendix below.

  5. Restrictions on use

    1. Use of the Licensed Material may be permitted for incidental promotional use and minor editing privileges, e.g. minor adaptations of single figures, changes of format, colour and/or style, where the adaptation is credited as set out in Appendix 1 below. Any other changes, including but not limited to cropping, adapting, or omitting material in a way that affects the meaning, intention or moral rights of the author, are strictly prohibited.

    2. You must not use any Licensed Material as part of any design or trademark.

    3. Licensed Material may be used in Open Access Publications (OAP) before publication by Springer Nature, but any Licensed Material must be removed from OAP sites prior to final publication.

  6. Ownership of Rights

    1. Licensed Material remains the property of either Licensor or the relevant third party and any rights not explicitly granted herein are expressly reserved.

  7. Warranty

    1. IN NO EVENT SHALL LICENSOR BE LIABLE TO YOU OR ANY OTHER PARTY OR ANY OTHER PERSON FOR ANY SPECIAL, CONSEQUENTIAL, INCIDENTAL OR INDIRECT DAMAGES, HOWEVER CAUSED, ARISING OUT OF OR IN CONNECTION WITH THE DOWNLOADING, VIEWING OR USE OF THE MATERIALS REGARDLESS OF THE FORM OF ACTION, WHETHER FOR BREACH OF CONTRACT, BREACH OF WARRANTY, TORT, NEGLIGENCE, INFRINGEMENT OR OTHERWISE (INCLUDING, WITHOUT LIMITATION, DAMAGES BASED ON LOSS OF PROFITS, DATA, FILES, USE, BUSINESS OPPORTUNITY OR CLAIMS OF THIRD PARTIES), AND WHETHER OR NOT THE PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. THIS LIMITATION SHALL APPLY NOTWITHSTANDING ANY FAILURE OF ESSENTIAL PURPOSE OF ANY LIMITED REMEDY PROVIDED HEREIN.

  8. Limitations

    1. BOOKS ONLY: Where 'reuse in a dissertation/thesis' has been selected, the following terms apply: print rights of the final author's accepted manuscript (for clarity, NOT the published version) for up to 100 copies; electronic rights for use only on a personal website or institutional repository as defined by the Sherpa guideline (www.sherpa.ac.uk/romeo/).

  9. Termination and Cancellation

    1. Licences will expire after the period shown in Clause 3 (above).

    2. Licensor reserves the right to terminate the Licence in the event that payment is not received in full or if there has been a breach of this agreement by you.


    Appendix 1 — Acknowledgements:

      For Journal Content:
      Reprinted by permission from [the Licensor]: [Journal Publisher (e.g. Nature/Springer/Palgrave)] [JOURNAL NAME] [REFERENCE CITATION (Article name, Author(s) Name), [COPYRIGHT] (year of publication)

      For Advance Online Publication papers:
      Reprinted by permission from [the Licensor]: [Journal Publisher (e.g. Nature/Springer/Palgrave)] [JOURNAL NAME] [REFERENCE CITATION (Article name, Author(s) Name), [COPYRIGHT] (year of publication), advance online publication, day month year (doi: 10.1038/sj.[JOURNAL ACRONYM].)

      For Adaptations/Translations:
      Adapted/Translated by permission from [the Licensor]: [Journal Publisher (e.g. Nature/Springer/Palgrave)] [JOURNAL NAME] [REFERENCE CITATION (Article name, Author(s) Name), [COPYRIGHT] (year of publication)

      Note: For any republication from the British Journal of Cancer, the following credit line style applies:

      Reprinted/adapted/translated by permission from [the Licensor]: on behalf of Cancer Research UK: [Journal Publisher (e.g. Nature/Springer/Palgrave)] [JOURNAL NAME] [REFERENCE CITATION (Article name, Author(s) Name), [COPYRIGHT] (year of publication)

      For Advance Online Publication papers:
      Reprinted by permission from [the Licensor]: on behalf of Cancer Research UK: [Journal Publisher (e.g. Nature/Springer/Palgrave)] [JOURNAL NAME] [REFERENCE CITATION (Article name, Author(s) Name), [COPYRIGHT] (year of publication), advance online publication, day month year (doi: 10.1038/sj.[JOURNAL ACRONYM])

      For Book content:
      Reprinted/adapted by permission from [the Licensor]: [Book Publisher (e.g. Palgrave Macmillan, Springer etc.)] [Book Title] by [Book author(s)] [COPYRIGHT] (year of publication)

    Other Conditions:


    Version  1.2

  • Questions? Contact the Licensor's customer care email or +1-855-239-3415 (toll free in the US) or +1-978-646-2777.



    Material details:

    • Original author's name: Fajar Purnama, Tsuyoshi Usagawa
    • Document title: Incremental Synchronization Implementation on Survey using Hand Carry Server Raspberry Pi
    • Book or journal title: IEICE Technical Report, vol. 117, no. 65, ET2017-4, pp. 21-24, May 2017.
    • Portion: Figure 5

    Permission No.: 20GB0052

    IEICE hereby grants permission for the use of the material requested above, on condition that the following requirements are met:

    • Indication of source (e.g., author's name, document title, name of journal, volume/issue/page number, publication date, etc.)
    • Indication of copyright (e.g. "Copyright (c)2016 IEICE")
