Showing posts with label tips for life. Show all posts

Friday, April 12, 2013

Version Control

This is a nerdy post, relevant only for empirical researchers in social sciences. It may also be relevant for those whose job involves the creation of tons of computer files to finish one project, though.

Matthew Gentzkow and Jesse M. Shapiro of Chicago Booth advocate the use of version control in empirical social science research (see Chapter 3 of their guide, "Code and Data for the Social Sciences: A Practitioner’s Guide").

If you are new to the idea of version control, watch a series of videos from Software Carpentry.

The basic idea of Gentzkow and Shapiro is that empirical researchers in social science should think of writing data analysis scripts as developing software to be released to the public. We need to allow other researchers to replicate our empirical findings. For this purpose, we should make all our code and datasets public once we publish the paper. It's often the case, however, that by the time you publish the paper, your computer directories are cluttered with many files that are not needed to produce the final results. And cleaning them up often leaves you unable to replicate the final results reported in the paper. Version control can avoid such a problem.

However, it seems to me that the main benefit of version control is something else: to track the evolution of your thoughts on each empirical research project.

We empirical researchers often face a situation like this:

"Well, I need to analyze this particular thing. I think I did it a few months ago. Which files did I write for this purpose? I cannot find them in my computer."

So you have to start from scratch. A massive waste of time.

The branching function of version control (a great illustration can be found in section 3.4 of Pro Git, written by Scott Chacon) helps us avoid this problem. Every time you try out a new way of analyzing the data, create a new branch (call it the test branch). Within the test branch, keep developing your code and recording (committing) your changes. If it turns out to be a bad idea, you can stop working on the test branch and go back to the "master" branch. This way, all the new files you committed for the failed idea disappear from your working directory. All the clutter is cleaned away. However, these files are preserved behind the curtain. If you later find the failed idea to be actually a good one, you can recover all the files you created in the test branch. Then you can merge the test branch into the master branch very easily.
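The branching routine described above can be sketched as a short Python script that drives Git from the command line. This is only an illustration: it assumes Git is installed and that the main branch is called "master" as in this post, and the function names, the branch name, and the commit message are all made up for the example.

```python
import subprocess


def branch_experiment_commands(branch="test", message="Try a new specification"):
    """Git commands for trying an idea on its own branch (hypothetical names)."""
    return [
        ["git", "checkout", "-b", branch],  # create the branch and switch to it
        # ... write your new analysis scripts here, then:
        ["git", "add", "-A"],               # stage the new and changed files
        ["git", "commit", "-m", message],   # record them on the test branch
        ["git", "checkout", "master"],      # go back; committed test files leave the working directory
    ]


def revive_experiment_commands(branch="test"):
    """Git commands for bringing a shelved idea back into master."""
    return [
        ["git", "checkout", "master"],      # make sure we are on master
        ["git", "merge", branch],           # merge the test branch into it
    ]


def run_all(commands, repo_dir="."):
    """Run each command inside repo_dir, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling run_all(branch_experiment_commands("idea")) inside a project folder that is already a Git repository would shelve the idea on a branch named "idea"; run_all(revive_experiment_commands("idea")) would later merge it back.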

There are several version control systems out there. Git appears to be the best one for branching. (And this article confirms my impression.) However, Git itself is a command-line tool from the Unix world. Its user interface is not particularly friendly unless you are a computer programmer.

Among the wide range of graphical interfaces for Git (see the partial list provided by the official Git website), I find Gitbox the most intuitive. It's like an iPhone: you can use it without reading a manual. It runs on Mac OS. For Windows, I don't know which one is the best.

The only problem with Gitbox is that it does not visualize branches. Perhaps it is a good idea to also use another graphical interface just for visualizing branches. But it seems to me that none of the available software is very good at visualizing branches.

There is one issue with Git per se. It's a "distributed" version control system. That is, you keep all the files on your local computer and, whenever appropriate, sync them with a remote server (a bit like Evernote). And all the previous versions of each file are stored on your local computer. This is fine if you only write ASCII (plain-text) files. It's not fine if you "version-control" binary files such as data and images, because every stored version of a large binary file takes up disk space. If you use Git, therefore, it's a good idea to version-control only the scripts that run your statistical software. Data can be reproduced by running those "tracked" scripts each time.

As opposed to distributed systems, there are also centralized version control systems (such as Subversion), which keep track of file histories on a remote server. (See this article for a comparison of centralized and distributed systems.) The drawback of centralized version control is that branching takes time (because each time you create a new branch, you need to download every file from the remote server). If the main benefit of version control is branching, then the distributed system appears to be the way to go.

Another merit of using version control is to make collaboration easy. It's an effective way to keep different people's edits to different parts of the project from piling up into lots of conflicts that cannot immediately be resolved. For collaborative use of version control, however, your coauthors also need to know about version control (which is new to most people in social science) and agree on when to create a new branch and when to "commit" your work. (To commit means to record all the file changes you have made so that they can be tracked in the future.) That doesn't seem to be easy.

I'm still learning about version control. One thing that I have to figure out is how to use Dropbox for version control. Freshmob and Sam Doidge suggest how to do it.


Sunday, June 13, 2010

Extra charges for low-cost airlines

Although a low-cost airline is a great way of traveling in Europe, you always feel ripped off when you book a ticket. This is because the amount you pay is always higher than the price you first see on the booking webpage.

To reduce this psychological pain, I note below how much more than the price shown immediately after the search I need to pay on the four low-cost airlines that I often use, if I check in one piece of luggage, want my preferred seat, and pay by credit card.

Ryanair: 29 euro (20 for 1 check-in luggage up to 15kg, 4 for priority queue, and 5 for credit card payment)

EasyJet: 29.75 euro (11 for 1 check-in luggage up to 20kg, 9.5 for speedy boarding, and 9.5 for credit card payment)

Norwegian: 21 euro (8 for 1 check-in luggage up to 20kg, 8 for seat reservation, and 5 for credit card payment)

Air Berlin: 5 euro (for credit card payment)
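As a quick check of the arithmetic, here is a small Python sketch that adds up the extras from the list above and adds them to an advertised fare. The function names and the 40-euro example fare are made up for illustration. Note that the EasyJet components as listed add up to 30 euro, slightly more than the 29.75 quoted above.

```python
# Extra charges in euro, copied from the list above.
EXTRAS = {
    "Ryanair": {"luggage": 20, "priority": 4, "credit card": 5},
    "EasyJet": {"luggage": 11, "speedy boarding": 9.5, "credit card": 9.5},
    "Norwegian": {"luggage": 8, "seat reservation": 8, "credit card": 5},
    "Air Berlin": {"credit card": 5},
}


def total_extras(airline):
    """Sum of all extra charges for one airline."""
    return sum(EXTRAS[airline].values())


def full_price(airline, advertised_fare):
    """What you actually pay: the advertised fare plus all the extras."""
    return advertised_fare + total_extras(airline)


# Hypothetical example: compare a 40-euro advertised fare across the four airlines.
for name in EXTRAS:
    print(name, full_price(name, 40))
```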

Thursday, April 01, 2010

Security alert to Hotmail users

It seems some Hotmail email accounts were recently hijacked by someone who wants to promote www.happyhu.com as an iPhone 3G vendor. I keep receiving emails from my former landlord in London (with whom I haven't been in contact for a long time) with the following content:

Hey How are you these days? I bought one apple iphone 3gs black from this website www.happyhu.com ..

It's obvious that this is spam. But it's hard to mark it as spam, because all the messages from this person (whom you do know) will then be marked as spam.

I googled the message content and found this person and this person facing the same problem.

Sunday, October 21, 2007

How to create a PDF poster with LaTeX

If you are an economist or any other researcher using maths, you should be using LaTeX to write papers or to create presentation slides. If your paper is accepted for a poster session at a conference, how do you create a poster involving mathematical expressions? Here's what to do.

1. Visit the beamerposter package website.

2. Download "beamerposter-example.zip" from somewhere in the middle of the page.

3. Extract the downloaded zip file.

4. Open "example.tex" with TeXnicCenter or any other TeX editor.

5. Customize it to your taste.

If you repeatedly need to create posters with LaTeX, then do the following in addition:

1. Create a new directory of your preferred name (e.g. beamerposter) under the directory "C:\Program Files\MiKTeX 2.5\tex\latex" or "C:\texmf\tex\latex" (depending on how your MiKTeX is installed).

2. Download "beamerposter.sty" from somewhere in the middle of the beamerposter package webpage, and save it under the directory you just created.

3. Launch MiKTeX Options (Start > Programs > MiKTeX > Settings).

4. Click "Refresh FNDB", and wait for a while.

5. Close MiKTeX Options.

(See here for why you need to do this.)

Friday, August 10, 2007

Beware of Ryanair's online checkin system

I booked a Ryanair flight to Italy a month ago. By paying a bit more money (4 pounds), I opted for online check-in (and priority boarding).

Two days before the departure, I got an email reminding me of the online check-in.

Then I realized that the online check-in system is only available for passengers with a European passport who are NOT travelling from Italy to the UK.

Well, I still enjoy the priority boarding, but I feel like I was cheated...

Saturday, May 19, 2007

Tip for copy-and-paste

I've long been grumpy about copy-and-pasting. When you copy some text from a webpage and paste it into, say, your Word file, not only the letters but also the font format, which differs from that of your Word file, gets pasted. I finally understand how to avoid this. See this webpage.

Sunday, November 05, 2006

How to use MS Word Mail Merge

By browsing the Internet, I learn that the top 5 business schools are Harvard, Stanford, Wharton (Penn), Kellogg (Northwestern), and Sloan (MIT). The top 7 include Chicago and Columbia. Duke and NYU are ranked in the top 10 or top 15. Europe's top 3 are Insead, LBS, and IMD, perhaps ranked in the top 20 in the US. Since some development economists are in business schools, I plan to apply to business schools as well.

Buy 90 C4 manilla envelopes (A4 size) at Ryman, 10 minute walk from LSE. Costs 12.48 pounds. Sounds expensive, but don't have time to check other stores. LSE Student Union shop does not sell envelopes in bunches.

Finish writing cover letters. Here's how to efficiently write cover letters by using MS Word mail merge.
1. Create address lists in Excel. The top row should be "First Name", "Company", "Address 1", "Address 2", "City", "State", "Post Code", and "Country". This will facilitate the matching of fields.
2. Save it and close it.
3. Open MS Word and select "Tools -> Letters and Mailings -> Mail Merge..."
4. (Step 1 of 6) Choose "Letters" and click Next.
5. (Step 2 of 6) Click Next unless you already have a template letter.
6. (Step 3 of 6) Click "Browse" and choose the Excel file created in steps 1-2. Then click Next.
7. (Step 4 of 6) Write a template cover letter. Click a location where you want to put the address. Click "More items..." Choose a field you want to put. Click "Insert". Repeat this until all the fields show up. Then click Next.
8. (Step 5 of 6) By clicking ">>" or "<<", choose the address of the school. Click Next.
9. (Step 6 of 6) Click "Edit individual letters..." and edit the template. Save it.
10. Click "Previous" to go back to step 5 of 6. Then repeat steps 8 to 10 until all the letters are completed and saved.
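For readers curious about what mail merge does under the hood, here is a minimal Python sketch of the same idea: one template, one row per addressee, and each field name in the template replaced by that row's value. It is not how MS Word implements it. The address list, the template text, and the "<<Field>>" placeholder syntax are all made up for illustration.

```python
import csv
import io

# Hypothetical address list; in the steps above this lives in an Excel file.
ADDRESSES = """First Name,Company,City,Country
Ann,Example Business School,Boston,USA
Bob,Sample School of Management,London,UK
"""

# Hypothetical template; <<Field>> marks where a field from the list goes.
TEMPLATE = "Dear <<First Name>>,\nI am applying to <<Company>> in <<City>>, <<Country>>.\n"


def merge(template, row):
    """Fill one copy of the template with the values from one row."""
    letter = template
    for field, value in row.items():
        letter = letter.replace("<<" + field + ">>", value)
    return letter


def merge_all(template, csv_text):
    """Produce one letter per row of the address list."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [merge(template, row) for row in rows]


for letter in merge_all(TEMPLATE, ADDRESSES):
    print(letter)
```

The top row of the list plays the same role as the Excel header row: the names there have to match the placeholders in the template exactly.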

I also need to make address labels to stick on the envelopes.

How to use MS Word's mail merge to create address labels:
1. Create address lists in Excel. The top row should be "Last Name", "Company", "Address 1", "Address 2", "Address 3", "City", "State", and "Country". This will facilitate the matching of fields.
2. Save it and close it.
3. Open MS Word and select "Tools -> Letters and Mailings -> Mail Merge..."
4. (Step 1 of 6) Choose "Labels" and click Next.
5. (Step 2 of 6) Click "Label options..." and choose "8253 - Label" for product number (this works best for me). Click OK and click Next.
6. (Step 3 of 6) Click "Browse" and choose the Excel file created in steps 1-2. Then click Next.
7. (Step 4 of 6) Click "More items..." Choose a field you want to put. Click "Insert". Repeat this until all the fields show up on the address label. Then click "Update all labels". Then click Next.
8. (Step 5 of 6) Make sure that addresses appear in the way you prefer. If not, click "Previous" to go back to Step 4 of 6. If you're happy, click Next.
9. (Step 6 of 6) Click "Print" to print out address labels.

Realize I also need to make address labels for MY address as the sender.

Some schools require teaching evaluations. I create a short note on my teaching evaluation to clarify it.

Revise the CV. Revise the Introduction and Conclusions of the JMP. Go home by the last train.

Monday, August 21, 2006

Secure your wireless LAN

As a new housemate moves in today, I rethink the security of our wireless home network for broadband access.

About.com provides a nice summary of tips.

Its fourth tip talks about something called "MAC address". How to get your MAC address is illustrated succinctly here. The link for Windows XP takes you to the same page as for Windows 2000. But what's written there actually works for Windows XP as well. I also found a couple of similar websites, but they are more complicated than this.

Finally, my additional tip: Don't confuse the MAC address for the wireless network equipment with the MAC address for the ethernet adapter. What you need is the former, not the latter.
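If you happen to have Python installed, a MAC address can also be read programmatically. The sketch below uses the standard library's uuid.getnode(), which returns a hardware address as a 48-bit integer. Be aware that on a machine with both a wireless and an ethernet adapter it may not return the wireless one, and that it falls back to a random number if no address can be found, so check the result against what your operating system shows. The function names are made up for illustration.

```python
import uuid


def format_mac(node):
    """Turn a 48-bit integer into the usual aa:bb:cc:dd:ee:ff notation."""
    return ":".join(f"{(node >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


def this_machine_mac():
    """One hardware address of this machine (not necessarily the wireless one)."""
    return format_mac(uuid.getnode())


print(this_machine_mac())
```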

Saturday, August 12, 2006

Vacuum Cleaners Buying Guide

This kind of thing takes time to learn, but it is very easy to forget. So I'll take note of it here.

There are eight types of vacuum cleaners on the market: upright, canister (or cylinder), stick (or broom), handheld, robotic, wet/dry, carpet steam, hard surface steam (for the last three types, see Lowe's how-to-buy guide for details). In addition, you can have a central vacuum system (see Cana-Vac Central Vacuum System Buyers Guide).

Among these, the most basic types are upright and canister. The upright type is for carpets while the canister type is for hard surfaces such as kitchen floors, though some canister cleaners these days can also be used on carpets. Here is a checklist for deciding which to buy.

1. Beater
The upright type usually has a beater to loosen dirt and bring it out of the carpet. Some canister types also have one.

2. Bagless (or cyclonic) versus bagged
With bagged cleaners, you need to purchase a new bag from time to time. Check if the cleaner has a "Bag Full" indicator. Bagless cleaners allow you to empty the dust bin without the hassle of changing a bag.

3. Filter
Bagged cleaners do not have filters. Bagless ones come with filters. HEPA filters are ideal for people with allergies. Of course, a better filter raises the price.

4. Cord length and its retractability (or automatic cable rewind)
Make sure that the cord is long enough. The canister type usually has automatic cable rewind while the upright type often requires you to wind the cord on your own.

5. Suction power
The number of amps does not tell you the suction power. Some companies use a unit called the "airwatt" to measure cleaning efficiency. (Do not confuse it with the plain "watt", which measures the electrical power the motor draws.) The higher the airwatt figure, the more powerful the cleaner. A reviewer at Shopping.com says, "suction alone does not make a vacuum clean efficiently, it also requires airflow [(the volume of air a motor is capable of moving)] to pick up dirt."

However, most cleaner-makers only report the number of watts...
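To see why both suction and airflow matter, here is a simplified Python illustration. In SI units, airflow multiplied by suction pressure has the dimension of power, and with airflow in litres per second and suction in kilopascals the product comes out directly in watts. Manufacturers' airwatt ratings follow their own test procedures, so treat this only as a sketch; the function name and the two example cleaners are made up.

```python
def air_watts(airflow_l_per_s, suction_kpa):
    """Air power in watts: airflow (litres per second) times suction (kilopascals)."""
    return airflow_l_per_s * suction_kpa


# Two hypothetical cleaners: B has stronger suction but moves less air.
cleaner_a = air_watts(airflow_l_per_s=20, suction_kpa=10)
cleaner_b = air_watts(airflow_l_per_s=10, suction_kpa=15)

print(cleaner_a, cleaner_b)
```

In this made-up comparison, cleaner B has the stronger suction but the lower air power, which is the reviewer's point.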

References:
Electricshopping.com
eHow.com
eBay Guides

Saturday, February 18, 2006

Robotic floor cleaners

Everybody believes that Japan is always at the cutting edge, especially when it comes to consumer electronics products.

Not really true. I'm surprised to find iRobot Roomba. This fancy robot, developed by iRobot three years ago, automatically cleans floors, something Japanese household appliance makers couldn't think of.

According to a discussion board on robotic floor cleaners for Japanese people, Roomba is on sale in Japan for 80,000 Japanese yen (near 800 US dollars or 400 British pounds) while you can buy one in the U.S. for 250 US dollars.

Part of the reason is likely to be that Japanese people live in small houses, without enough space for a robotic cleaner to run around the floor. Toshiba produced something similar, but it's now discontinued...

This year iRobot has even produced Scooba, which washes, scrubs, and dries the kitchen floor automatically! (See BBC News report.) I want to buy this... (It's not yet on sale in the UK.)

Anyway, that's a distant dream. All I need to do now is to find the best hoover in the UK. Do you have any suggestions? Is Henry, the British original hoover, really good? (I found a Japanese website devoted to Henry and his family, which is far more informative than any English webpage on Henry... Look at the official website of Numatic, the British company making Henry. The company doesn't seem to realize Henry's potential appeal to a wide range of consumers.)

Friday, January 06, 2006

Making Presentation Slides including Maths

I can sometimes become a computer geek. Here's proof.

Economics uses mathematics. Writing an academic paper in economics requires a word processor capable of inputting and outputting mathematical expressions. Microsoft Word is horrible for this purpose. So most economists use a word processor called Scientific Workplace.

Still, Scientific Workplace is horrible. Its help files are hieroglyphics. I've already wasted tons of time figuring out how to do one thing or another with this software. But as far as writing a paper is concerned, it is not too bad.

When it comes to making slides for presentation, however, Scientific Workplace is even more horrible. Basically it's useless for making decent slides. Microsoft PowerPoint coupled with TexPoint doesn't work quite properly, either.

I have figured out the ideal software environment for making slides for presentation in the economics profession (or any academic discipline using mathematics): MiKTeX + TeXnicCenter + Prosper. (For Prosper, this page at University of Colorado is more useful than the official one.) They are all free to download and install.

In case you're interested, here's the recipe:
1. Download and install MiKTeX.
2. Download and install TeXnicCenter.
3. Download and install Ghostscript (this is required to use Prosper).
4. Join TeXnicCenter-Users Yahoo! Group (this is necessary to download a batch file in the next step).
5. After your Yahoo! Group membership is accepted, download prospermake.zip by following the link provided in this message.
6. Unzip prospermake.zip and follow the instructions given in readme.txt to create a new output profile for TeXnicCenter.

One warning: do not install any of the software (MiKTeX, TeXnicCenter, or Ghostscript) under the directory "C:\Program Files". The reason is that this directory name includes a space, which prevents the new output profile from working properly (yes, this is Microsoft's fault). The simplest way to avoid the problem is to install them all directly under "C:".

Now you're ready to use Prosper. You don't need to download and install Prosper itself; MiKTeX automatically does it when you compile a TeX file meant to use the Prosper package. (You need an Internet connection, though.)

A couple of technical notes: Steps 3-6 are needed because TeXnicCenter is not compatible with Prosper by default. The main reason is that PDFLaTeX, as used with TeXnicCenter, doesn't work with Prosper. You need PS2PDF to create a PDF file. Ghostscript has this, and a batch file included in prospermake.zip allows you to use it properly.

Saturday, July 02, 2005

Spanish names

While tracking the dictators of Latin America, I get confused by their names a lot. Take the Somoza dynasty in Nicaragua. The founder's name is Anastasio Somoza Garcia, who ruled the country from 1937. When he was assassinated in 1956, his son Luis Anastasio Somoza Debayle succeeded him. In 1967, Anastasio Somoza Debayle, Luis's brother, succeeded.

How complicated! Especially, the last guy's name has no original part!

Then I found this website explaining how Spanish-speakers name their children. Finally it makes sense. Take the second-to-last name as the family name!
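The rule of thumb can be written as a tiny Python function. It is only a sketch of the "second-to-last name" rule from this post: it does not handle multi-word surnames (such as those containing "de la") or people who go by a single surname, and the function name is made up.

```python
def family_name(full_name):
    """Return the second-to-last word of a Spanish-style full name."""
    parts = full_name.split()
    if len(parts) < 2:
        return full_name  # nothing to pick from; give the name back unchanged
    return parts[-2]


for name in [
    "Anastasio Somoza Garcia",
    "Luis Anastasio Somoza Debayle",
    "Anastasio Somoza Debayle",
]:
    print(name, "->", family_name(name))
```

All three Somozas come out with the same family name, which is what makes the dynasty readable at last.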

Tuesday, March 29, 2005

Turntables

I've been wanting to buy a turntable as most drum and bass tracks are only available on vinyl. At last, I did a bit of research by Googling a lot this morning. After two hours of research, I found that a setup for my purpose costs nearly 300 quid.

There are basically two types of turntables out there: home-use and DJ-use. The main differences between the two are (1) belt drive or direct drive; (2) with or without built-in phono-preamplifier; (3) without or with pitch control.

The first difference is about whether or not you can scratch records: direct drive lets you, belt drive doesn't. For my purpose, I don't necessarily need to scratch, but if possible, I want to practice scratching just for fun. So the DJ turntable wins here.

The second difference is about whether or not you want to connect your turntable to the Hi-Fi system without a phono input. If you just plug a turntable into your stereo, the sound level will be very low. You need to amplify the vinyl sound. But my stereo does not have a phono-preamplifier. Connecting a DJ turntable to line-in inputs won't work. The home-use turntable wins here.

The final difference is about whether or not you are happy to hear the music only at the recorded speed. I want to listen to drum and bass on vinyl. But DJs always play records a bit quicker than the recorded speed. I'm used to that speed from going clubbing or listening to the radio. So I want to control the pitch of the music. The winner in the final round is the DJ turntable.

So I need a turntable with a built-in phono-preamplifier AND pitch control. The only option for this type of turntable seems to be the Vestax BDT-2500, which costs nearly 300 quid. This one is a belt-drive type. So I can't scratch. There seems to be no direct-drive turntable with a built-in preamplifier.

Or I can buy the Numark TT200 for less than 200 quid (reputable DJ turntables like those made by Technics and Vestax cost more than 300 quid) and a phono-preamplifier like the TerraTec Phono PreAmp Studio USB for 100 quid.

But I don't have 300 quid. My budget is 200 quid...