Book Review: Web Operations: Keeping the Data on Time

For my kickoff of systems engineering book reviews I have chosen this book. While not technical in the strict sense of the term (if you are looking for code snippets or ready-to-use architecture ideas, look elsewhere), this collection of 17 essays provides a bird's-eye view of the relatively new discipline of Web Operations. As you will see from the short TOC below, no stone is left unturned: broad coverage is given to a range of subjects, from NoSQL databases to community management (and all the points in between). This is what you will be getting:

  1. Web Operations: The career
  2. How Picnik Uses Cloud Computing: Lessons Learned
  3. Infrastructure and Application Metrics
  4. Continuous Deployment
  5. Infrastructure As Code
  6. Monitoring
  7. How Complex Systems Fail
  8. Community Management and Web Operations
  9. Dealing with Unexpected Traffic Spikes
  10. Dev and Ops Collaboration and Cooperation
  11. How Your Visitors Feel: User-Facing Metrics
  12. Relational Database Strategy and Tactics for the Web
  13. How to Make Failure Beautiful: The Art and Science of Postmortems
  14. Storage
  15. Nonrelational Databases
  16. Agile Infrastructure
  17. Things That Go Bump in the Night (and How to Sleep Through Them)

Where does one start? A chapter-by-chapter play-by-play is not the way to go – the chapters are short and to the point and use a variety of formats (one of them, for example, is a long interview) – so I am going to talk about the overall feel of the book instead.

The roll call of authors is impressive. I am sure that if you have worked in the field for a little while, names like Theo Schlossnagle, Baron Schwartz, Adam Jacob, Paul Hammond et al. speak for themselves. Every chapter serves as a gentle introduction to the relevant subject matter – this is to be expected, as the topics are quite deep and each one carries a huge assorted bibliography. What I particularly like about this book is not only the gentle introduction; it is also written in a way that makes it approachable to technical managers, team leaders and CTOs – chapters such as the one on postmortems and the ones on metrics are prime examples of this. What is awesome is that the book helps you identify problem areas in your current business (for example, the lack of configuration management such as Puppet or Chef) and provides you with actionable information. Extra points for openly acknowledging failure: there are more than two chapters related to it (as the saying goes, if you operate at internet scale, something is always on fire, someplace), including a chapter on how to conduct efficient postmortems. Even non-technical areas such as community management are covered, illustrating that not everything about running an internet business today is technology oriented.

Your experience with this book will vary greatly. If you are new to the topics at hand, you might benefit from reading each and every chapter and then revisiting the book from time to time – as your experience grows, so will the number of useful ideas you can get out of it. If you are an experienced professional, while this book might not be an epiphany, there is still useful content to apply, and perhaps a few additional viewpoints will present themselves.

Overall? An excellent book for everyone involved in running an internet business, with a lot of value and a long shelf life.

A final nice touch: proceeds from this book go to charity.

Coming up on Commodity

For the past few months I have been silent, the last entry being a re-blog from xorl’s (defunct?) blog. That is quite a long time for a writer’s block, eh? Well, here is some insight: professionally I have somewhat moved away from security towards a systems engineering paradigm. While security still plays an important part both professionally and in my personal time, it is not the dominant focus. Building systems engineering skills is hard work, especially if you focus on the engineering part as opposed to the systems part (e.g. systems administrator and systems engineer should not be interchangeable terms). My plan is to publish reviews of books and other resources that I found helpful during my journey, as well as some original hacks that I have made. I have a strict policy of not posting anything related to $DAYJOB, but I am more than willing to share some nuggets of experience. So stay tuned and say hi to the revitalized Commodity blog!

Introduction to Sensu

Slide deck for an internal presentation I gave on Sensu a few months ago.

I recently finished this book and, for the first time in a long while, I really wanted to write a review. xorl beat me to it 😦

xorl %eax, %eax

Everybody in the “security world” knows Michal Zalewski and his work, especially in the field of web security and exploitation. So, with no further introduction, here is my review of his new book, “The Tangled Web”.



Title: The Tangled Web: A Guide to Securing Modern Web Applications
Author: Michal Zalewski

Chapter 1: Security in the World of Web Applications
Here we have a nice introduction to web application security, going through all the required theoretical background as well as useful historical references.

Part I: Anatomy of the Web
Chapter 2: It Starts with a URL
Although a chapter dedicated to URLs might initially seem like overkill, M. Zalewski proves the opposite. In this chapter we see that there are so many details involved in parsing URLs correctly that it is extremely difficult for an application to handle all of them properly.

Chapter 3: Hypertext Transfer Protocol


Rediscovery and Security News

First things first: Happy 2012 everyone.

So, this blog has been silent for a little while now. More astute readers might argue along the lines of “hey man! This is supposed to be a technical blog – where are all them technical articles? Have you run out of material?”.

Take a deep breath – the dreaded, almost compulsory metablogging post after a long pause is coming …

The answer is a big NO! There is an abundance of material that I am proud of, BUT a lot of this research was done while solving problems for paying clients. The problem can be refined as “how do you tip-tap-toe around NDAs, and do you choose to do so?”. Smart money says not to do it, so I am not. Keep this point in mind for the latter part of this post.

One of the design decisions for this rebooted blog was that it should convey an air of positivity, at least by security and research standards, which is not the happiest of domains. So, for better or worse, I decided to bottle the acid for some time, even if that meant leaving gems such as the following (courtesy of a well known mailing list) untouched:

I have problems with those that create malware – under the guise of
“security research” – which then gets used by the bad guys.

I’m not saying that one can never stop breaking into things. I just
don’t like the glorification of creating malware by the so-called
“good guys”. If all of that energy instead was placed into prevention,
then we would be better off.

[SNIP]
P.S. One might argue that a whitehat or security researcher can’t
change sides and go into prevention, or in other words, be a Builder
instead of a Breaker. They can’t because they don’t have the skills to
do it.

Finished picking your jaw off the floor? Good! While Cpt. Obvious is on his way with the usual “vuln != exploit != malware” reply, let’s get things moving with a pet peeve of mine that I have not seen addressed.

Almost every time a new security trend comes out, there is nary a hint that it might have been discovered some place else or some time before. Given that security overlaps a lot with cryptography, I just cannot get my head around the following: while rediscovery is a well accepted notion within the cryptography field (and this has been proven time and time and time again), infosec rarely entertains the thought that something you are “discovering” might have been discovered (and countered!) before.

Enter infosec, an ecosystem where NDAs are ten-a-penny, the underground is more tight-lipped than ever, the general consensus is that confidentiality is a necessity, and a lot of “discoveries” are handled either via the black market (and the lack of morals implied therein) or via security brokers. That was all fine and dandy, but with the arrival of fame-seeking researchers and “researchers”, and given that infosec makes for entertaining, sensationalist headlines that actually “sell seats in the audience” (use of quotes intentional, if you haven’t guessed already), we are now bombarded every day with “news” and “research” that fall into one of the following categories:

  • News from the obvious department. This one is getting more and more annoying lately but it is much too obvious a target
  • Less obvious stuff that falls below the radar of cargo-cult security but still way more likely to have been encountered in the field by serious practitioners who fall into one of the non-disclosure categories listed above
  • Actual new and/or insightful findings, which tend to be lost within the sea of useless information, the stuff that REALLY makes your day

Since there is a very fine line between the second and third categories (again, the first is way too easy a target to make fun of) and one can never be sure in such a rapidly moving and secretive landscape, for the love of $DEITY: the next time you see something related to infosec findings, keep in the back of your head that it might be a rediscovery. And dear reporters, PLEASE DROP THE SENSATIONAL HEADLINES.

I am not holding my breath that this will ever happen, but one can only hope …

PS: An image, courtesy of the blackhats.com infosuck webcomic. Not exactly the point I am trying to convey, but the message is quite similar and, in any case, it is much too funny to be left outside the party.

P For Paranoia OR a quick way of overwriting a partition with random-like data

(Surgeon General’s warning: The following post contains doses of paranoia which might exceed your recommended daily dosage. Fnord!)

A lot of the data sanitisation literature advises overwriting partitions with random data (btw, SANS Institute research claims that even a single pass with /dev/zero is enough to stop MFM recovery, but YPMV). So, leaving Gutmann-like techniques aside: in practice, generating random data takes a long time on your average system, which does not contain a cryptographic accelerator. To speed things up, /dev/urandom can be used in lieu of /dev/random, noting that, when read, the non-blocking /dev/urandom device will return as many bytes as are requested even if the entropy pool is depleted. As a result, the stream is not as cryptographically sound as /dev/random, but it is faster.
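
For the impatient, here is a minimal sketch of the urandom approach in Perl (keeping with this blog’s language of nostalgia). The target device path is a placeholder of my own, so change it before even thinking of running this:

    #!/usr/bin/env perl
    # Fill a partition with data read from /dev/urandom.
    # WARNING: this irreversibly destroys whatever lives on $target.
    use strict;
    use warnings;

    my $source = '/dev/urandom';
    my $target = '/dev/sdXN';      # placeholder partition - change before use!
    my $bs     = 1024 * 1024;      # write in 1 MiB chunks

    open my $in,  '<:raw', $source or die "open $source: $!";
    open my $out, '>:raw', $target or die "open $target: $!";

    my $buf;
    while (sysread($in, $buf, $bs)) {
        # syswrite fails with ENOSPC once the device is full - that is our exit
        syswrite($out, $buf) or last;
    }
    close($out);
    close($in);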

Assuming that time is of the essence and your paranoia level is low, there is an alternative you can use that both provides random-like data (which means you do not have to fall back to /dev/zero and keep your fingers crossed) and is significantly faster. Enter Truecrypt. Truecrypt allows for encrypted partitions using a variety of algorithms that have been submitted to peer review and are deemed secure for general usage. I can hear Johnny Sceptical shouting: “Hey, wait a minute now, this is NOT random data, what the heck are you talking about?”. First of all, Truecrypt headers aside, let’s see what ent reports. For those of you not familiar with ent, it is a tool that performs a statistical analysis of a given file (or bitstream, if you tell it so), giving you an idea about its entropy and other way, way useful statistics. For more information, see man 1 ent.

For the purposes of this demonstration, I have created the following files:

  • an AES-encrypted container
  • an equivalent-size file with data from /dev/urandom (I know, but I was in a hurry)
  • a well-defined binary object in the form of a shared library
  • a system configuration file
  • a seed file containing a mixture of English, Chinese literature, some C code, and strings(1) output from the non-encrypted swap (wink-wink, nudge-nudge)

Let’s do some ent analysis and see what results we get (for the hastily written, non-strict-compliant Perl code, look at the end of the article):

    ################################################################################
    processing file: P_for_Paranoia.tc 16777216 bytes
    Entropy = 7.999988 bits per byte.

    Optimum compression would reduce the size
    of this 16777216 byte file by 0 percent.

    Chi square distribution for 16777216 samples is 288.04, and randomly
    would exceed this value 10.00 percent of the times.

    Arithmetic mean value of data bytes is 127.4834 (127.5 = random).
    Monte Carlo value for Pi is 3.141790185 (error 0.01 percent).
    Serial correlation coefficient is 0.000414 (totally uncorrelated = 0.0).

    ################################################################################
    processing file: P_for_Paranoia.ur 16777216 bytes
    Entropy = 7.999989 bits per byte.

    Optimum compression would reduce the size
    of this 16777216 byte file by 0 percent.

    Chi square distribution for 16777216 samples is 244.56, and randomly
    would exceed this value 50.00 percent of the times.

    Arithmetic mean value of data bytes is 127.4896 (127.5 = random).
    Monte Carlo value for Pi is 3.143757139 (error 0.07 percent).
    Serial correlation coefficient is -0.000063 (totally uncorrelated = 0.0).

    ################################################################################
    processing file: seed 16671329 bytes
    Entropy = 5.751438 bits per byte.

    Optimum compression would reduce the size
    of this 16671329 byte file by 28 percent.

    Chi square distribution for 16671329 samples is 101326138.53, and randomly
    would exceed this value 0.01 percent of the times.

    Arithmetic mean value of data bytes is 82.9071 (127.5 = random).
    Monte Carlo value for Pi is 3.969926804 (error 26.37 percent).
    Serial correlation coefficient is 0.349229 (totally uncorrelated = 0.0).

    ################################################################################
    processing file: /etc/passwd 1854 bytes
    Entropy = 4.898835 bits per byte.

    Optimum compression would reduce the size
    of this 1854 byte file by 38 percent.

    Chi square distribution for 1854 samples is 20243.47, and randomly
    would exceed this value 0.01 percent of the times.

    Arithmetic mean value of data bytes is 86.1019 (127.5 = random).
    Monte Carlo value for Pi is 4.000000000 (error 27.32 percent).
    Serial correlation coefficient is 0.181177 (totally uncorrelated = 0.0).

    ################################################################################
    processing file: /usr/lib/firefox-4.0.1/libxul.so 31852744 bytes
    Entropy = 5.666035 bits per byte.

    Optimum compression would reduce the size
    of this 31852744 byte file by 29 percent.

    Chi square distribution for 31852744 samples is 899704400.21, and randomly
    would exceed this value 0.01 percent of the times.

    Arithmetic mean value of data bytes is 74.9209 (127.5 = random).
    Monte Carlo value for Pi is 3.563090648 (error 13.42 percent).
    Serial correlation coefficient is 0.391466 (totally uncorrelated = 0.0).

Focusing on entropy, we see that

    Truecrypt: Entropy = 7.999988 bits per byte.
    /dev/urandom: Entropy = 7.999989 bits per byte.

which are directly comparable (if you trust ent, that is), much better than a well-structured binary file (5.666035 bits per byte), and head and shoulders above our seed results (a conglomerate unlikely to be encountered in practice). The chi-square test does separate them: the computed value would randomly be exceeded 10 percent of the time for the Truecrypt container versus 50 percent for /dev/urandom – a factor of 5 in favor of the /dev/urandom data – yet both are still far closer to random than our other test cases.
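
As an aside, the headline “bits per byte” number that ent prints is just the Shannon entropy of the file’s byte histogram, H = -Σ p(i)·log2 p(i). Here is a minimal Perl sketch of the same computation (my own toy, not ent’s code), in case you want to sanity-check the reports above:

    #!/usr/bin/env perl
    # Compute Shannon entropy in bits per byte: H = -sum(p_i * log2(p_i)),
    # where p_i is the observed frequency of byte value i in the file.
    use strict;
    use warnings;

    my $file = shift or die "usage: $0 <file>\n";
    open my $fh, '<:raw', $file or die "open $file: $!";

    my @count = (0) x 256;    # histogram over the 256 possible byte values
    my $total = 0;
    my $buf;
    while (my $n = sysread($fh, $buf, 65536)) {
        $count[$_]++ for unpack('C*', $buf);
        $total += $n;
    }
    close($fh);

    my $entropy = 0;
    for my $c (@count) {
        next unless $c;
        my $p = $c / $total;
        $entropy -= $p * log($p) / log(2);    # log2 via natural logs
    }
    printf "Entropy = %f bits per byte.\n", $entropy;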

From the above there is a strong indication that, when you need random-like data and /dev/urandom is too slow – for example when you want to “randomize” your swap area, as I will elaborate on in an upcoming post – a Truecrypt volume will do in a pinch.

    #!/usr/bin/env perl
    use warnings;
    use File::stat;
    # a 5 min script (AKA no strict compliance) to supplement results for a blog article
    # why perl? Nostalgia :-)

    @subjects = qw(P_for_Paranoia.tc P_for_Paranoia.ur seed /etc/passwd /usr/lib/firefox-4.0.1/libxul.so);

    # run ent(1) against a file and print a report header plus its output
    sub analyzeEnt {
        my ($file) = @_;
        my $sz  = stat($file)->size;
        my $ent = `ent $file` . "\n";
        print "#" x 80 . "\nprocessing file: $file " . $sz . " bytes\n" . $ent;
    }

    foreach my $subject (@subjects) {
        &analyzeEnt($subject);
    }

A Playstation Network intrusion post-mortem *I* would like to see

By now I am sure everyone has heard that Sony’s Playstation Network (and assorted services) has been down for the better part of a week, with no definite restoration date in sight, due to an “external incursion”. Sony was so nice as to offer a FAQ (you might also be interested in this Wired article). However, there are a couple of points I take issue with:

1) First of all, let’s stop the blame game. All I have seen so far from the major media outlets is “Was it Anonymous? Was it the rebug firmware modder guys?” (both of these parties have denied involvement; you can read more about rebug here). Instead of that, can we get some full disclosure on WHAT happened, WHY it happened, and WHAT steps have been undertaken so that it will never happen again? So far, from Sony’s FAQ:

Q.3 What is the main reason to this problem? Which parts of the system were vulnerable to the intrusion?
We are currently conducting a thorough investigation of the situation. Since this is an overall security related issue, we will not comment further on this case.

When I, among millions of other people, entrust my details to a network, I place some trust in its operators. Even more so when that network is PSN: it is not like Sony is a Mom-and-Pop shop unable to afford top-notch security equipment and advice; the fact that the PS3 stayed “unhackable” for 4+ years, as far as the general public is concerned, proves they can allocate ample security resources to what they deem critical. And the details at stake are not trivial:

Q.6 Does that mean all users’ information was compromised? Tell us more in details of what personal information leaked.

In terms of possibility, yes. We believe that an unauthorized person has obtained the following information that you provided: name, address (city, state/province, zip or postal code), country, email address, birthdate, PlayStation Network/Qriocity password, login, password security answers, and handle/PSN online ID. It is also possible that your profile data may have been obtained, including purchase history and billing address (city, state/province, zip or postal code). If you have authorized a sub-account for your dependent, the same data with respect to your dependent may have been obtained. If you have provided your credit card data through PlayStation Network or Qriocity, it is possible that your credit card number (excluding security code) and expiration date may also have been obtained.

I, for one, would like to know exactly what intrusion countermeasures are in place so that details such as these (can you spell “identity theft”?) are kept relatively safe. The “you did not have to provide all these details” defense simply does not apply, thanks to Sony’s “provide us your details or else you will not be able to deathmatch online” policy. As was the case with other major data breaches, could this signify that security measures were simply lacking? While this can be (and so far IS) a PR disaster for Sony, one way to regain some “face” is to admit any mistakes and share relevant information for the benefit of the community (both the user community and the “infosec” community). As for “but do you really expect them to give out a list of the countermeasures in place?” – well, security by obscurity has never, ever worked in the past, so why should it work now?

2) While this is not strictly Sony’s fault, and appears to be a standard mode of operation, why is the onus of protecting my data on me, the average user? Again, let’s quote the FAQ:

Q.9 I want to know if my account has been affected.

To protect against possible identity theft or other financial loss, we encourage you to remain vigilant to review your account statements and to monitor your credit reports. Additionally, if you use the same user name or password for your PlayStation Network or Qriocity service account for other unrelated services or accounts, we strongly recommend that you change them. When the PlayStation Network and Qriocity services are back on line, we also strongly recommend that you log on to change your password.
For your security, we encourage you to be especially aware of email, telephone, postal mail or other scams that ask for personal or sensitive information. Sony will not contact you in any way, including by email, asking for your credit card number, social security number or other personally identifiable information. If you are asked for this information, you can be confident Sony is not the entity asking.

Q.10 What should I do to prevent any unauthorized use of my (credit card) personal information?

For your security, we encourage you to be especially aware of email, telephone, postal mail or other scams that ask for personal or sensitive information. Sony will not contact you in any way, including by email, asking for your credit card number, social security number or other personally identifiable information. If you are asked for this information, you can be confident Sony is not the entity asking. Additionally, if you use the same user name or password for your PlayStation Network or Qriocity service account for other unrelated services or accounts, we strongly recommend that you change them. When the PlayStation Network and Qriocity services are back on line, we also strongly recommend that you log on to change your password.
To protect against possible identity theft or other financial loss, we encourage you to remain vigilant to review your account statements and to monitor your credit reports.

So a network with Sony’s backing (read: a practically infinite amount of money), instead of owning up to its mistakes, assumes a stance which, once you get through all the legalese, effectively amounts to: “oh, we lost your data, and we will not even let you know what measures we had in place to address such attacks, nor give you a heads-up on what caused all this, nor let you know what we have done to address the issue after the fact; sorry if you get any charges on your credit card or if your details are now in the hands of scammers”. And no, Sony, a 20 euro voucher so I get to see Ryu kicking some butt wearing his grandma’s underpants does not really cut it.

I am very, very disappointed. To sum it up, I would really like to see Sony come forward and help us regain some of the trust lost, but I am not holding my breath.