• Creating duplicate InfluxDB databases/datasets in InfluxCloud using SELECT INTO

    InfluxCloud Chronograf currently doesn’t provide a fast and easy way of creating duplicates of your existing databases. Their support (which comes with the paid service) is sadly also very slow and it’s a bit of a chore to make them do anything for you within a reasonable timeframe unless it’s something supercritical.

    So, if you’re in need of a way to create a backup and re-create a DB multiple times because you’re looking to test some destructive actions on your data (such as updating records) and you’re looking to avoid dealing with the “support”, here’s a hacky way to deal with it which might end up being the faster thing to do.

    Meet Mr. SELECT INTO. What SELECT INTO aims to do is help you copy all your data from all the measurements you might have into a new database (which has to be created before running the query). It also aims to keep your existing data structure intact (measurement names and which records go where). In its simplest form, and in ideal circumstances, the query you’ll need to execute will look like this:
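
    Assuming both databases sit under the default autogen retention policy (the database names below are placeholders for your own), something along these lines:

        SELECT * INTO "duplicate_db"."autogen".:MEASUREMENT FROM "original_db"."autogen"./.*/ GROUP BY *

    The :MEASUREMENT backreference together with the /.*/ regex copies every measurement into a measurement of the same name in the target database, and GROUP BY * keeps tags as tags instead of flattening them into fields.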

    A few details that are not so great about the vanilla SELECT INTO approach:

    1. Continuous Queries will not be copied over.
    2. If you try to run the SELECT INTO query from the Chronograf “Data Explorer” and your dataset is not trivially small, the Chronograf request carrying your query will time out after 60 seconds and the query will simply stop. It won’t even continue running in the background.
    3. Data types of some values might change for no apparent reason. For example, INT columns (values) might get converted to FLOATs upon data copying, even though all the data stored in these columns initially WAS actually INT data (?!). This is interesting because it might come back to bite you later once you try to execute aggregate functions to backfill some of the downsampled measurements with new/fixed data. What exactly is the issue with that? For example, the SELECT INTO query might not convert INTs to FLOATs for downsampled measurements (which are typically populated by CQs), but only for the “lowest resolution” measurements (the measurement you insert the data into explicitly). So after you’ve happily copied over all the data, if you try to insert a max(thought_it_was_an_int) from the default measurement into a downsampled one, it will now try to store a FLOAT into an INT column, and that will obviously return an exception. Happy days.

     

    What this means:

    1. Probably not a big deal if you just want to mess with the data, but keep it in mind.
    2. You’ll have to break your data down into smaller datasets to copy over (for example into one-day chunks using a WHERE time … clause). You’ll probably need an SDK library (such as InfluxData.Net) to automate the process.
    3. You’ll have to ditch the wildcard SELECT from the vanilla example and manually select (and explicitly cast) all the columns that you want to copy over to the new DB. This will make the process slightly more complicated but at this point you’re probably automating everything through a small script / migration app anyway so it shouldn’t be too much additional work.

     

    So, here’s a proof-of-concept code sample that explicitly selects columns to copy over, forces casting to the INT data type (or keeps the data type intact), selects specific source and target measurements (tables) and selects only a single day’s worth of data. You will have to iterate through 1-2 loops to inject your desired source and target table names and time ranges to cover all the data that you’ve got in your DB.
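
    A sketch of that idea in C# (database, measurement and field names are placeholders; how each query gets executed is left to a delegate because the exact call depends on the client you use):

        using System;
        using System.Collections.Generic;
        using System.Threading.Tasks;

        public class InfluxCopyJob
        {
            // Copies data measurement-by-measurement, one day at a time, with explicit
            // field selection and ::integer casts so INTs don't silently become FLOATs.
            public async Task CopyAsync(
                Func<string, Task> executeQuery,   // plug in your Influx client call here
                IEnumerable<string> measurements,
                DateTime fromUtc,
                DateTime toUtc)
            {
                foreach (var measurement in measurements)
                {
                    for (var day = fromUtc.Date; day < toUtc; day = day.AddDays(1))
                    {
                        var query =
                            $"SELECT \"an_int_field\"::integer, \"some_other_field\" " +
                            $"INTO \"duplicate_db\".\"autogen\".\"{measurement}\" " +
                            $"FROM \"original_db\".\"autogen\".\"{measurement}\" " +
                            $"WHERE time >= '{day:yyyy-MM-ddTHH:mm:ssZ}' " +
                            $"AND time < '{day.AddDays(1):yyyy-MM-ddTHH:mm:ssZ}' " +
                            $"GROUP BY *";

                        await executeQuery(query);
                    }
                }
            }
        }

    Calling it then boils down to plugging in your client of choice, for example q => influxClient.Client.QueryAsync(q, "original_db") with InfluxData.Net (double-check the exact method signature for the client version you’re using).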

    I really hope this helps and saves you some time. If you have any questions, feel free to leave a comment.

    #FastNotFastButMaybeFaster

    Cheers!

  • Downloading/streaming Azure Storage private container blobs to AngularJS through .Net WebAPI

    When our Azure storage contains files that are meant to be publicly accessible, it’s pretty trivial to deliver them to the end-user. We can either embed such items (e.g. images) or simply add links which point to them (e.g. pdfs) because Azure provides direct links to them. But what happens when these files contain sensitive data that is not meant for just anyone? Perhaps some kind of report?

    Well, it gets a bit more complicated. Since these files no longer have publicly accessible URIs, there are several steps we need to go through:

    • Authenticate against Azure using the SDK
    • Load the file into MemoryStream
    • Deliver the stream to the client (browser)
    • Convert the byte array into an actual file on the client-side and simulate “downloading”

    Since we’ll need more than just the MemoryStream, we’ll wrap it together with the file metadata into a model object.
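
    Something along these lines (the class and property names are illustrative, not prescribed):

        using System.IO;

        // Wraps the downloaded blob stream together with the metadata the
        // response headers will need (name, content type, size).
        public class AzureBlobModel
        {
            public string FileName { get; set; }
            public string ContentType { get; set; }
            public long? FileSizeBytes { get; set; }
            public MemoryStream Stream { get; set; }
        }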

    We’ll use an AzureProvider class to authenticate against Azure, download the file and create the model object.
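
    A minimal sketch of such a provider, assuming the classic WindowsAzure.Storage SDK (the class and method names are illustrative):

        using System.IO;
        using System.Threading.Tasks;
        using Microsoft.WindowsAzure.Storage;
        using Microsoft.WindowsAzure.Storage.Blob;

        public class AzureProvider
        {
            private readonly string _connectionString;

            public AzureProvider(string connectionString)
            {
                _connectionString = connectionString;
            }

            public async Task<AzureBlobModel> GetBlobAsync(string containerName, string blobName)
            {
                // Authenticate against Azure using the storage connection string
                var account = CloudStorageAccount.Parse(_connectionString);
                var client = account.CreateCloudBlobClient();
                var container = client.GetContainerReference(containerName);
                var blob = container.GetBlockBlobReference(blobName);

                // Pull the blob metadata (content type, size) before downloading
                await blob.FetchAttributesAsync();

                // Load the file into a MemoryStream
                var stream = new MemoryStream();
                await blob.DownloadToStreamAsync(stream);
                stream.Position = 0;

                return new AzureBlobModel
                {
                    FileName = blobName,
                    ContentType = blob.Properties.ContentType,
                    FileSizeBytes = blob.Properties.Length,
                    Stream = stream
                };
            }
        }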

    Our actual controller will inherit from this BaseApi class, which contains a custom method returning IHttpActionResult which we can name AzureBlobOk. This is something pretty reusable, so it’s good to have it at hand in your base class. What it does is set up all the content headers and attach the stream as the response content payload. It also returns HTTP status 200, which means OK, everything went fine.
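
    A sketch of what that base class could look like (naming is illustrative):

        using System.Net;
        using System.Net.Http;
        using System.Net.Http.Headers;
        using System.Threading;
        using System.Threading.Tasks;
        using System.Web.Http;

        public abstract class BaseApiController : ApiController
        {
            // Wraps the blob model into a 200 OK response with the proper content headers.
            protected IHttpActionResult AzureBlobOk(AzureBlobModel blob)
            {
                return new AzureBlobResult(blob);
            }

            private class AzureBlobResult : IHttpActionResult
            {
                private readonly AzureBlobModel _blob;

                public AzureBlobResult(AzureBlobModel blob)
                {
                    _blob = blob;
                }

                public Task<HttpResponseMessage> ExecuteAsync(CancellationToken cancellationToken)
                {
                    var response = new HttpResponseMessage(HttpStatusCode.OK)
                    {
                        // The stream itself is the response payload
                        Content = new StreamContent(_blob.Stream)
                    };

                    response.Content.Headers.ContentType =
                        new MediaTypeHeaderValue(_blob.ContentType ?? "application/octet-stream");
                    response.Content.Headers.ContentDisposition =
                        new ContentDispositionHeaderValue("attachment") { FileName = _blob.FileName };
                    response.Content.Headers.ContentLength = _blob.FileSizeBytes;

                    return Task.FromResult(response);
                }
            }
        }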

    The actual controller is pretty simple:
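
    For example (the route, container name and dependency wiring are assumptions):

        using System.Threading.Tasks;
        using System.Web.Http;

        public class ReportsController : BaseApiController
        {
            private readonly AzureProvider _azureProvider;

            public ReportsController(AzureProvider azureProvider)
            {
                _azureProvider = azureProvider;
            }

            [HttpGet]
            [Route("api/reports/{fileName}")]
            public async Task<IHttpActionResult> GetReport(string fileName)
            {
                var blob = await _azureProvider.GetBlobAsync("reports", fileName);
                return AzureBlobOk(blob);
            }
        }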

    On the client side, we’ll need the following service to actually convert the byte array that we got from the API into something meaningful. I tried various approaches, but in the end decided to use FileSaver.js which “implements the HTML5 W3C saveAs() FileSaver interface in browsers that do not natively support it”. What it will do is turn the byte array into an actual file and prompt the user to download it.
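
    A sketch of such a service, assuming FileSaver.js is loaded on the page (it exposes the global saveAs()) and the API route from the controller above; the module and service names are illustrative:

        // blobService.js: fetches the file as an arraybuffer and hands it to FileSaver.js
        angular.module('app').factory('blobService', ['$http', function ($http) {
            return {
                getBlob: function (fileName) {
                    return $http.get('/api/reports/' + encodeURIComponent(fileName), {
                        responseType: 'arraybuffer'
                    }).then(function (response) {
                        // Turn the raw byte array into a Blob and prompt the download
                        var contentType = response.headers('Content-Type') || 'application/octet-stream';
                        var file = new Blob([response.data], { type: contentType });
                        saveAs(file, fileName);
                    });
                }
            };
        }]);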

    This service can easily be consumed by injecting it into your AngularJS controllers and calling the .getBlob() function which will do all the heavy lifting for you.
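
    For example, sticking with the illustrative names from above:

        angular.module('app').controller('ReportsCtrl', ['blobService', function (blobService) {
            var vm = this;
            // Hook this up to an ng-click on a "Download" button
            vm.download = function (fileName) {
                blobService.getBlob(fileName);
            };
        }]);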

    Hope this helped, enjoy! :)

  • PdfSharp/MigraDoc to Azure Storage in-memory upload

    From my (somewhat limited) experience, PdfSharp/MigraDoc seems like a pretty fine and powerful library for creating .pdf documents, but the same can’t be said for its documentation. It’s a bit all over the place, and the multiple different NuGet versions/builds and outdated StackOverflow code samples don’t really help the situation.

    However, creating a .pdf document in-memory and uploading it straight to Azure is not really that complicated. When might this be useful? For example, when you need to generate a report that, instead of being handed to the user immediately, just needs to be stored for later access.

    The magic word we’re looking for is MemoryStream. We’ll use two classes: one which takes a MemoryStream and uploads it to Azure (AzureProvider.cs), and another which creates a very simple MigraDoc document (ReportProvider.cs) that you can build upon. The document then gets fed to the AzureProvider in the form of a MemoryStream.

    The code is pretty straightforward and looks like this:
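
    Roughly along these lines; the class names follow the post, while the report content, container and blob names are placeholders:

        using System.IO;
        using System.Threading.Tasks;
        using Microsoft.WindowsAzure.Storage;
        using Microsoft.WindowsAzure.Storage.Blob;
        using MigraDoc.DocumentObjectModel;
        using MigraDoc.Rendering;

        public class ReportProvider
        {
            // Builds a very simple MigraDoc document and renders it into a MemoryStream.
            public MemoryStream CreateReport()
            {
                var document = new Document();
                var section = document.AddSection();
                section.AddParagraph("Hello from MigraDoc. Build your actual report content here.");

                var renderer = new PdfDocumentRenderer(true) { Document = document };
                renderer.RenderDocument();

                var stream = new MemoryStream();
                renderer.PdfDocument.Save(stream, false); // false = leave the stream open for the upload
                stream.Position = 0;
                return stream;
            }
        }

        public class AzureProvider
        {
            private readonly string _connectionString;

            public AzureProvider(string connectionString)
            {
                _connectionString = connectionString;
            }

            // Uploads the in-memory .pdf straight to a blob container, no temp files involved.
            public async Task UploadReportAsync(string containerName, string blobName, MemoryStream stream)
            {
                var account = CloudStorageAccount.Parse(_connectionString);
                var client = account.CreateCloudBlobClient();
                var container = client.GetContainerReference(containerName);
                await container.CreateIfNotExistsAsync();

                var blob = container.GetBlockBlobReference(blobName);
                blob.Properties.ContentType = "application/pdf";
                await blob.UploadFromStreamAsync(stream);
            }
        }

    Wiring it up is then just a matter of feeding the stream from ReportProvider.CreateReport() into AzureProvider.UploadReportAsync().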

    Somewhat related: in the next post I’ll explain how to stream a file from an Azure Storage private container through .Net WebAPI to an AngularJS app.
