Some Java tips
For everyone who comes from the C++ world of dangling pointers and manual resource management, Java looks deceptively similar until you look under the hood. (I’m pretty much a newbie to Java, forced to learn its quirks primarily because of Hadoop.) Most of these tips are written for grad students whose life cycle can be succinctly summarized as “program, collect data, evaluate, rinse, repeat”.
Tip 1. Get a good IDE (read Eclipse)
Unlike the world of C++, where the standard library is small enough that you can mostly hold it in your head, Java has a massive standard library with features for everything under the sun. Keeping all of that in your head is not for the faint of heart. That’s where a good IDE like Eclipse comes in; its content assist features in particular can make your life a lot easier. The IDE also takes a lot of the pain out of managing your projects and libraries (Java has preferred conventions for writing and using libraries). One more amazing thing about Eclipse is that it has a nifty Java debugger which lets you inspect, among other things, the threads currently running in your program.
If you think Eclipse is massive bloat, or you don’t have access to a powerful system for your development (this happens to me sometimes, when I have to ssh into some machine and change, recompile, rinse, repeat while running experiments), then get JDEE (an Emacs plug-in).
Tip 2. Get Ant for Building
Ant is a simple Java build system based on XML (yeah.. yeah, you have to write XML, stop complaining). In fact, you can take their tutorial build file and start from there. It works with very little modification, unless your project has special needs (in which case, the Internet is your friend). Say goodbye to makefiles, man up and write some XML.
Tip 3. When you have multiple cores, use Java threads.
Java has threading built in. Thread synchronization still has its pains, but Java makes things a bit easier, and the best part is that on any Linux distribution with NPTL (which is almost every distribution out there) the Java threading model is essentially 1-1: each Java thread maps to a native thread, which means you can use all those cores to do the hard work. This is great if you are running experiments under different conditions, where each run is essentially independent of the others. This usually gives me a performance boost of more than 20-25% (but then, I have access to a machine with 8 cores and 8 GB of RAM, so YMMV).
Here is a really nice tutorial to get started on Java threads
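To make the "independent runs" idea concrete, here is a minimal sketch of how I would farm runs out to a thread pool. The class name, the runExperiment method and its sum-of-squares body are made up for illustration; swap in whatever your real experiment does. It assumes Java 8 or later for the lambda.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelExperiments {

    // Hypothetical stand-in for one experiment run: sums the squares 1..n.
    public static long runExperiment(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Runs one independent experiment per parameter on a fixed-size pool
    // (one worker per available core) and returns the results in the same
    // order as the parameters.
    public static List<Long> runAll(List<Integer> params) throws Exception {
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int p : params) {
                tasks.add(() -> runExperiment(p));
            }
            List<Long> results = new ArrayList<>();
            // invokeAll blocks until every task has finished.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(Arrays.asList(10, 100, 1000)));
    }
}
```

Since the runs share no state, there is nothing to synchronize here; the pain only starts when threads have to touch the same data.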
Tip 4. Learn to use Scanner and Console classes.
For those who mostly work with input and output in text files, Scanner is one of your best friends. It essentially encapsulates the work of reading a line, splitting it into its parts and converting them into your favorite data types. And for interactive input, learn to use the Console class, which provides a nice interactive console. If the simple readLine method does not satisfy your needs, Java has a complete regex library, and again, the Internet is your friend.
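As a rough sketch of what that looks like in practice, the snippet below reads whitespace-separated records with Scanner. The "name trials score" layout and the class name are assumptions I made up for the example, not a format you have to use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

public class ScannerDemo {

    // Parses whitespace-separated "name trials score" records.
    // The record layout is a made-up example.
    public static List<String> parse(String text) {
        List<String> rows = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            // Fix the locale so "0.93" parses whatever the system locale is.
            in.useLocale(Locale.US);
            while (in.hasNext()) {
                String name = in.next();        // first token as text
                int trials = in.nextInt();      // second token as an int
                double score = in.nextDouble(); // third token as a double
                rows.add(name + ":" + trials + ":" + score);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(parse("baseline 10 0.93\ntuned 12 0.5\n"));
    }
}
```

To read from a real file, pass a java.io.File to the Scanner constructor instead of a String. One thing to watch with Console: System.console() returns null when the JVM has no terminal attached (for example when input is redirected), so check for that before calling readLine.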
Tip 5. Use jconsole
And for the last tip of the day, learn to use jconsole, a monitoring tool for Java applications which lets you inspect memory usage and active threads. If you are using the tool locally, then for the most part no configuration is required: type jconsole at a command prompt and you should see a window with the list of running Java processes and their PIDs. Connect to the one you want to monitor and you are ready to track your Java process.
Things get a bit tricky with remote processes and security issues, so my advice is to set up a VPN or use ssh tunneling. More information on using jconsole is available here.
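If you can't run a GUI at all (say, on that machine you only reach over ssh), the same kind of numbers can be read from inside the program through the JVM's standard management beans in java.lang.management. This is only a small sketch of that idea, not a replacement for jconsole; the class and method names are my own.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

public class JvmStats {

    // Heap memory currently in use by this JVM, in bytes.
    public static long heapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    // Number of live threads in this JVM (daemon and non-daemon).
    public static int liveThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.getThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("live threads: " + liveThreads());
    }
}
```

Logging these two values every so often during a long experiment is a cheap way to spot a leak or a runaway thread count.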
So, that’s all for now.
Signing off,
Vishnu Vyas.
What do you get when you mix nerdy humour with homoeroticism?
Jake and Amir. That’s right, a nerdy show about a lonely, loserish guy obsessed with another guy, which borders on the homoerotic. Here is a fun clip to get you started.
Lunch Meeting from Amir on Vimeo.
Signing Off,
Vishnu Vyas
How much time does it take to secure a linux system?
4 hours, yeah.. 4 fucking hours, especially if you are a newbie to the whole “networking” thing: iptables, ipchains, thingamajigs…
I am trying to set up a small Python-based annotation engine and I am planning to let it out into the wild on the internet (the horror!!). Like any normal chump who sees the whole “web is the way to go for apps” mentality everywhere, I set up my application server behind Apache using mod_proxy and let it run for some time. And “some time” in the fast-moving internet space is 3 days. On the third day I check my logs and I see lots and lots of random people from all over the world trying to hack my damn server. Well, this story is not about them.
So I decide to set up iptables. That’s a pretty darn good idea, you might say.. except for one thing.. I don’t know anything about iptables. So, after browsing for almost an hour through tutorials, howtos, message boards and Google Groups (has anyone noticed the search in Google Groups sucks?), I still couldn’t get anywhere.
Every tutorial out there seems to want to teach me what a TCP packet is, or what link layer protocols are, or the history of the whole iptables filtering. Many would say that’s great, you learn from the basics, you get your concepts straight. And to them I say “F*#$ you”. I just want to secure my damn server, not take the RHCE. And finally, after three more hours of digging and reading about the various “subtleties” of the IP protocol, I managed to figure out what to do to secure my server.
Write 2 lines. Yeah, just 2 lines. The result of spending 4 fucking hours is not enlightenment, just getting to write two lines. For those who are using mod_proxy and don’t have a Linux networking guru at their every beck and call, here are those two lines:
/sbin/iptables -A INPUT -p tcp -m tcp -s "your-hostname/ip/trusted subnet" --dport "application server port" -j ACCEPT
/sbin/iptables -A INPUT -p tcp -m tcp --dport "application server port" -j DROP
Here “your-hostname/ip/trusted subnet” should usually refer to the machine on which Apache is running; in my case, that is the same machine. The “application server port” is the port on which CherryPy listens; by default I think it’s 8080. If you have multiple instances of CherryPy running, you need to add similar rules for each instance (note: add the ACCEPT rules before the DROP rules, since rules appended with -A are checked in order and the first match wins).
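Once the rules are in, it is worth checking from some other machine that the port really is closed. Any port scanner will do; purely as a sketch, here is a tiny Java probe that tries a TCP connect with a timeout. The class name, the default host and the default port 8080 are assumptions for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    // A refused connection or a timeout (e.g. packets silently dropped) gives false.
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // SocketTimeoutException and ConnectException are both IOExceptions.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8080;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

Run from the Apache machine it should report the port as reachable, and from anywhere else it should not. With a DROP rule the probe waits out the full timeout before giving up, because nothing ever answers.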
Signing off,
Vishnu Vyas
One more reason to love L.A.
If things like a jam-packed I-10, irrationally one-way downtown streets and high crime rates with falling house prices get you down.. don’t be too gloomy.. It’s LA after all, and this will just make you smile.
This is what you get when you mix raw improv talent with a little league game in Hermosa Beach…
Signing Off,
Vishnu Vyas.
My emacs wishlist.
Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.
-Zawinski’s Law
Now that Emacs has fulfilled its divine destiny via RMail, GNUS and various other add-ons, the next step should obviously be something most people have wished for: multithreading. Emacs needs it badly. Without it, it’s almost impossible to do anything of significance while Emacs is busy, especially now that Tramp ships with Emacs: I shouldn’t have to wait until a file gets saved over the network when I just want to start making more edits right away. Inferior processes are fine, but Emacs, and particularly elisp, is still in the days of DOS. With multicore CPUs becoming commodity hardware, I think the biggest priority is to implement multithreading for Emacs.
That’s all for now.
Signing off,
Vishnu Vyas.