Herding Code 185: Glenn Block on Splunk

At NDC Jon and K. Scott talk to Glenn Block about Splunk.

Download / Listen: Herding Code 185: Glenn Block on Splunk

Show Notes:

  • Intro
    • (00:18) Glenn got a new job at Splunk.
  • What is Splunk?
    • (00:40) Jon asks Glenn what Splunk does. Splunk has a product that gathers operational intelligence. It’s got a data analytics platform which understands a lot of log formats. It can handle streaming logs and has a bunch of APIs. It can index in realtime, handles unstructured data, and has some advanced pattern matching features.
    • (02:12) Glenn talks about some common uses. GitHub and Target both use Splunk. It’s especially liked by IT Admins who can query across multiple servers by timeslice in realtime. There’s a customizable dashboard to surface the information.
    • (03:24) Glenn says that since Splunk has a powerful API, you can push data into it. You can push data in using HTTP or TCP.
    • (04:01) You can teach Splunk to fetch data from a source using their app platform. Glenn talks about an Azure app he built for Windows Azure Web Sites diagnostics.
    • (05:39) Splunk is available in the cloud, but it’s often run on premises. It’s cross-platform. It doesn’t store the data, it just indexes it.
  • Pricing, free versions, cloud hosted versions
    • (06:44) Glenn says the pricing is based on data throughput. They have a free license that gives you 500MB/day, a developer license that gives you 10GB/day for a limited time, a free cloud product called Splunk Storm which gives you 20GB/application for 30 days, and a new enterprise product called Splunk Cloud running in AWS. The enterprise cloud product is especially useful for AWS hosted apps.
    • (08:20) Jon asks if there’s a planned cloud hosted offering for Windows Azure. Glenn says he’s pushing for it, but in the meantime it’s pretty easy to install it yourself.
    • (08:58) K. Scott asks about what he’d see if he used Glenn’s Azure app on a Windows Azure Web Site. Glenn lists some of the data and sources.
  • Developing Splunk apps and language support
    • (10:03) K. Scott asks about the process of writing a Splunk app. Glenn talks about all the language specific SDKs they support and describes the process.
    • (11:20) K. Scott asks how they support so many languages in Splunk. Glenn says it’s pretty Unixy in that it works with streams, so all the language specific SDKs work with that.
  • Using Splunk for evented data, not just logs
    • (12:25) Jon asks about some real world examples of things people are monitoring. Glenn talks about a recent DSL-like feature called data models, which allows business analysts to search through the data and graphically pivot on it. One of the places people use that is for monitoring the entire dev lifecycle. Security auditing is a huge use case. 50% of the Fortune 100 uses Splunk. Glenn gives an example of how one of his co-workers wrote a Node app using Firebase’s bus feed to show a realtime map with bus location.
    • (16:00) Jon says this seems to blur the lines between logs and event sourcing. Glenn says it’s not just a log platform, and works really well with evented data.
  • Technology stack
    • (16:44) Jon asks what technologies it runs on, and if it’s using Hadoop. Glenn says Hadoop’s great, but not for realtime. They do have a product called Hunk which can access Hadoop HDFS information, though. It’s mostly C++ and Python (Django). They’ve recently rolled out an app framework which makes it easy to customize Splunk using Django. There’s no database, since Splunk really just maintains indexes to data from other sources.
  • Glenn’s new book: Designing Evolvable Web APIs with ASP.NET
    • (19:25) Jon asks Glenn what he does in his free time. Glenn talks about the book he (and friends) are just finishing, called Designing Evolvable Web APIs with ASP.NET. It focuses on building a real hypermedia system with ASP.NET Web API.
    • (20:35) Jon asks about versioning: are they using headers, URLs, etc.? Glenn says their argument is based on using additional media types and hypermedia. Hypermedia makes it easier to evolve your API because your clients are following links, not using hardcoded URLs.
    • (22:15) Jon says hypermedia sounds great, but developers often want to follow defined links. Glenn says he doesn’t think of it as a magical automaton, but both developers and code can look for new links as they’re added.
    • (23:40) Jon says it’s harder to evolve APIs if you’re thinking RPC style, but once you’re focused on resources it’s easier. Glenn says this pattern has worked great for the web – clients just ignore things they don’t understand. Jon and Glenn say this is also similar to the move from relational databases to document databases.
    • (24:30) Glenn says it’s exciting to finally see some hypermedia APIs coming out: PayPal, GitHub, Amazon’s streaming APIs, and NPR’s recent API updates based on hypermedia.
    • (25:30) Glenn says the book doesn’t try to convince you that this is the only way, just shows the benefits. K. Scott says this sounds really useful to move from the theoretical to some concrete examples.
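
The hypermedia points from the discussion (clients follow links instead of hardcoded URLs, and ignore things they don’t understand) can be illustrated with a minimal sketch. This is not code from the book or from ASP.NET Web API; the `FAKE_API` documents, the `links`/`rel`/`href` field names, and the `fetch` and `follow` helpers are all hypothetical, and a dictionary stands in for real HTTP calls.

```python
# Hypothetical responses keyed by URL; a dictionary stands in for HTTP requests.
FAKE_API = {
    "/": {
        "links": [
            {"rel": "orders", "href": "/v2/orders"},
            # A link added later by the server; older clients simply ignore it.
            {"rel": "promotions", "href": "/promos"},
        ]
    },
    "/v2/orders": {"items": ["order-1", "order-2"], "links": []},
}


def fetch(url):
    """Return the document stored at url (stand-in for an HTTP GET)."""
    return FAKE_API[url]


def follow(doc, rel):
    """Follow the first link in doc with the given relation, or return None."""
    for link in doc.get("links", []):
        if link.get("rel") == rel:
            return fetch(link["href"])
    return None


# The client knows only the entry point and the relation name "orders".
root = fetch("/")
orders = follow(root, "orders")
print(orders["items"])
```

Because the client never builds the `/v2/orders` URL itself, the server could move that resource and only update the link in the root document.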

Show Links:

Herding Code 184: Scott Guthrie on Windows Azure

At NDC Jon and K. Scott talk to Scott Guthrie about his talk Building Real World Apps with Windows Azure, what’s new in Windows Azure, the advantage of provisioning and scaling up and down instantly, and more.

Download / Listen: Herding Code 184: Scott Guthrie on Windows Azure

Show Notes:

  • Scott’s talk: Building Real World Apps with Windows Azure
    • (00:18) Scott’s talk covered twelve patterns for building cloud apps using things like continuous delivery, transient fault handling, long term failures, etc.
  • What’s new in Windows Azure
    • (01:02) Jon asks Scott to overview the highlights of what’s new in Windows Azure over the past year
    • (01:25) Scott says they generally ship a major release every three weeks
    • (01:40) Scott talks about how they’re using agile approaches to development, and some services update as often as ten times a day
    • (02:19) Scott overviews some of the main things that shipped over the past year
      • Virtual Machine and Virtual Networking
      • Windows Azure Web Sites
      • Auto-Scale support
      • Hadoop
      • Mobile Services
      • Push Notification
      • Media Services
  • (06:37) K. Scott asks how Auto-Scale came to be. Scott Guthrie tells the story about how it came from an acquisition of an Azure startup incubation project. The team joined at the end of March and the feature shipped in June.
  • (08:42) Scott talks about how Azure and Cloud Development help you move faster with illustrations of how quickly you can create and integrate services and infrastructure and support multiple regions.
  • (10:22) Scott talks about the advantages of being able to quickly scale both up and down. He talks about how Troy Hunt was able to scale up Azure instances to crunch through databases of breached passwords to make it easy to see if your password has been compromised, then scaled right back down and spent less than a dollar.
  • (14:09) K. Scott asks about Node.js support. Scott talks about how they’ve been supporting Node for a long time, and how cloud development lets you easily choose between tools for different applications.
  • (15:09) Jon asks Scott what books he’s been reading lately.
    • (15:45) He’s been reading a lot of work related books on things like supply chain management
    • (16:35) Scott mentions the new Web API book by Glenn Block and friends
    • (16:46) He went to Australia and read The Fatal Shore, a book about the founding of Australia

Show Links:

Herding Code 183: Semantic Merge with Pablo Santos

The guys talk to Pablo Santos about Semantic Merge, a merge tool that understands your code.

Download / Listen: Herding Code 183: Semantic Merge with Pablo Santos

Show Notes:

  • Intro
    • (00:18) Semantic Merge is a diff tool with a semantic understanding of your code.
  • Language support
    • (01:01) Jon asks about what languages Semantic Merge supports. It currently supports C#, Visual Basic.NET and Java, and they’re currently working on adding support for C, then C++.
    • (02:00) Jon noticed that they’re using Roslyn and asks about that. Pablo says that it worked really well, handling the parsing to allow them to focus on the important things like diff calculation and semantic merge calculation.
    • (03:02) Jon asks about support for JavaScript. Pablo says it’s still under development and there’s a lot of demand for it. Since JavaScript isn’t so tightly structured, they’re still working on figuring out how to come up with something really useful there.
    • (04:08) Jon asks about how they handle parsing outside of Roslyn and .NET. Pablo lists the different parsers they use for different languages. They’ve opened up the way that languages plug in, which allowed for a community contributed Delphi parser.
    • (05:33) Scott K. asks about support for TypeScript, since it’s more strongly typed. Pablo says that’ll be easier, but they’re working through the language support list in order of demand.
  • What kind of semantics can Semantic Merge understand?
    • (06:28) K. Scott talks about what Semantic Merge does at a high level and asks about the different refactorings Semantic Merge can and can’t understand. Pablo explains a common scenario in which you’d be afraid to refactor code while adding or changing functionality if you know someone else is also working on it. Semantic Merge understands the refactorings so it’s easy to merge the actual changes. What Semantic Merge currently doesn’t handle is multi-file semantic merges, e.g. with code being refactored into another file. They’ve got a working prototype for that, but it’s harder to plug into different source control systems since they handle multi-file merges differently.
    • (08:52) Pablo points out that, while it’s called Semantic Merge, the diff functionality is really useful on its own.
  • The importance of graphical representation of merge issues
    • (09:17) Jon talks about how good the graphical representation is – both really easy to read and just generally nice looking. Pablo says they’ve put a lot of work into that and explains why they’ve designed it as they have.
    • (10:41) Scott K. says that developers are often stuck in a textual viewpoint for diff and merge, but a good graphical representation can be really useful. Pablo says that we’ve seen a recent revolution in source control tools, but we’re still using tools and technologies from twenty years ago. Jon says that the older ways of displaying diff and merge results with plus and minus lines were based on working with the old source control systems and mostly doing two-way merges.
    • (13:25) Pablo says it’s something that you really miss when it’s not there – big merges with lots of files look scary, but when you see that the actual changes are minimal it’s not such a big deal. Scott K. mentions a joke he saw on twitter about how a ten line code review finds ten issues, but a thousand line review passes easily.
    • (15:17) Jon asks how Semantic Merge has changed the way their team develops code, for instance by making them more ready to refactor code. Pablo gives an example with working on a year-old branch in which traditional diff gave him tons of merge conflicts but Semantic Merge only gave him one.
    • (17:19) Jon noticed that many of the samples were able to automatically merge everything and asks how Semantic Merge detects merge conflicts. Pablo explains how Semantic Merge not only is able to detect when changes don’t cause conflicts, but can also detect merge conflicts that other tools won’t find.
  • Version control integration
    • (19:36) Jon asks about which version control systems Semantic Merge integrates with. Pablo lists Git, Mercurial, TFS, Perforce, Sourcetree and Subversion and says that it’ll plug into just about anything because just about all version control systems use common conventions for diff / merge tool integration.
  • Platform support
    • (20:51) Jon asks about their recent Linux support and asks if that’s done using Xamarin and Mono. They use Mono for common backend code, but wrote native front-end code for Linux using Gtk#. They’re currently working on an OSX version using MonoMac, which gives it a true native front-end with a standard Mac look and feel.
  • Pricing model and free licenses
    • (22:52) Jon asks about the pricing model. There’s a 15 day free trial and a monthly subscription for $4/month. They wanted to experiment with pricing to make it so inexpensive that pricing wasn’t an issue. Jon asks if the subscription checking is complex. Pablo says it gives you a lot of leeway so it won’t block you if you’re coding on a plane or something. They don’t obsess over security since it’s such an inexpensive application to begin with.
    • (25:36) Jon asks about their free licenses for open source developers. Pablo says they use Mono extensively and have been offering open source licenses for Plastic SCM for a while. Pablo mentions some of the open source projects using Semantic Merge, including F-Spot and a lot of other Mono projects.
  • Semantic based insights
    • (26:57) Jon asks if they could use their information about semantic changes to source code over time to offer other insights to developers. Pablo says that this is something they’ve been doing with Plastic SCM with features like semantic method history, so you can track changes to a method over time across renames, refactoring to other files, etc. They also can offer richer metrics, so you don’t just see lines of code changed but can understand methods changed, refactorings, etc. Their goal for a long time has been to transform version control from a delivery mechanism to a productivity tool for developers.
  • Plastic SCM
    • (29:04) Jon asks how Plastic SCM compares to other version control systems. Pablo recommends going to PlasticSCM.com and looking at the branch explorer. It’s as powerful as Git but very easy to use. It’s fully decentralized. It’s very graphical, and you can do almost everything from the branch explorer. It integrates well with enterprise security with support for things like ACLs. It (of course) offers support for a lot of advanced merge scenarios. It’s been under development since 2005, and they’re on version 5 right now. It’s free for every team under 15 developers.
    • (31:25) Jon asks if there’s a way to test-drive Plastic SCM against an existing Git repository. Pablo explains how to do that without changing version control systems, since Plastic SCM can natively use the Git API.
    • (34:25) K. Scott asks about an old blog post about a small Windows Git application client; Pablo says that’s no longer required as it’s built into Plastic SCM.
  • Wrap up
    • (34:55) Jon asks about where listeners can find out more about Semantic Merge and Plastic SCM.
    • (35:50) Jon mentions that he really likes the team page on the Plastic SCM site – all the faces follow the mouse cursor as you move it around. He’s easily amused.
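
The idea behind a diff with a semantic understanding of code, discussed earlier in the episode, can be sketched in a few lines. This is only an illustration and not how Semantic Merge works: it assumes Python source parsed with the standard `ast` module (Semantic Merge itself targets C#, Visual Basic.NET and Java), and `function_bodies` and `semantic_diff` are hypothetical helper names. Functions are compared by name and parsed structure, so moving a function within a file is not reported as a change.

```python
import ast


def function_bodies(source):
    """Map each top-level function name to a position-independent dump of its AST."""
    tree = ast.parse(source)
    return {
        node.name: ast.dump(node)  # line numbers are omitted by default
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def semantic_diff(old_source, new_source):
    """Report functions added, removed, or changed, ignoring where they sit in the file."""
    old = function_bodies(old_source)
    new = function_bodies(new_source)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
    }


OLD = "def a():\n    return 1\n\ndef b():\n    return 2\n"
# b is moved above a (not a real change), a is edited, and c is new.
NEW = "def b():\n    return 2\n\ndef a():\n    return 10\n\ndef c():\n    return 3\n"

result = semantic_diff(OLD, NEW)
print(result)
```

A line-based diff would flag nearly every line of `NEW`, while this comparison reports only the edit to `a` and the new function `c`.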

 

Show Links: