Commercial data warehouses have long been something of a struggle. Massive data volumes were hard to handle, and the tooling around them was a relative mess. Companies have been trying to solve this problem for a long time, though few came close to providing a real solution. Amazon has changed that with Redshift.
What Redshift Is and Does
Redshift is a data warehousing service created by Amazon to simplify the data warehouse process. It is especially well suited to medium-sized data businesses, and it’s one of the few tools that companies across the board are starting to adopt.
Setting up a Redshift cluster is simple enough too. Define a few high-level parameters, choose the node size and the number of nodes, and Amazon does most of the remaining work for you. Once your cluster has been built, it’s up to you to set preferences and build out your own database. Amazon doesn’t want you to have to mess around with settings and software either, which is why Redshift is delivered as a managed service (no need to access a command line on the server). To say the least, Redshift is user-oriented. As with any other tool, though, there are some drawbacks to Redshift.
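To make the setup step concrete, here is a minimal Python sketch of the handful of settings a new cluster needs. The helper name `cluster_request`, the identifier, the node type and the credentials are all placeholders chosen for illustration. The dictionary keys follow the parameter names that boto3's Redshift `create_cluster` call accepts, but using boto3 at all is an assumption here, and the sketch only builds the request without sending anything to AWS.

```python
def cluster_request(identifier, node_type, node_count, db_name, user, password):
    """Collect the high-level settings for a new cluster in one dict."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    params = {
        "ClusterIdentifier": identifier,
        "NodeType": node_type,
        "DBName": db_name,
        "MasterUsername": user,
        "MasterUserPassword": password,
        "ClusterType": "single-node" if node_count == 1 else "multi-node",
    }
    # The node count is only sent for multi-node clusters.
    if node_count > 1:
        params["NumberOfNodes"] = node_count
    return params


# Placeholder values: pick your own identifier, node type and credentials.
request = cluster_request(
    "demo-warehouse", "dc2.large", 2, "analytics", "admin", "CHANGE-ME"
)

# Assumption: with boto3 installed and AWS credentials configured,
# the request could be submitted like this:
#   boto3.client("redshift").create_cluster(**request)
```

Keeping the request as plain data makes it easy to review the node type and node count before anything billable is created.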
Amazon really shines when it comes to Redshift’s UI. You’ll see all kinds of graphs and charts that put data into perspective. Queries are graphed alongside performance data, which makes it easier to grab information and details at a glance.
You can also browse the complete list of queries. Every query ID is hyperlinked, so selecting an ID takes you straight to that query. This is simplicity defined.
A Few Disadvantages
The main problem with Redshift (and this holds true for most Amazon products) is that you really have to use Amazon’s recommended software. In this case, that software is SQL Workbench/J. If you decide to use another client, you may run into a few hiccups along the way. The other drawback is that Redshift uses a columnar storage design, so you’ll need a crash course here if you’re not familiar with this type of database.
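If columnar storage is new to you, the toy Python sketch below may help. It is not how Redshift stores data internally. It only illustrates the idea that a column layout keeps all values of one field together, so totaling a single field does not require walking through every complete record. The sample orders and the `to_columns` helper are made up for the illustration.

```python
# Three made-up order records in a row-oriented layout.
rows = [
    {"order_id": 1, "region": "east", "amount": 120},
    {"order_id": 2, "region": "west", "amount": 80},
    {"order_id": 3, "region": "east", "amount": 200},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole record is visited to total one field.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: only the "amount" column is read.
total_from_columns = sum(columns["amount"])
```

Both totals are the same. The difference is how much unrelated data has to be read along the way, which is why analytic queries over a few fields tend to suit a columnar design.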
Lastly, some users may find it frustrating that bulk data loads cannot be run from the local file system. This is because you have no access to the server back-end. You can load directly from data stored in S3, though, so that helps. Overall, Redshift is just the right kind of solution to the data warehousing problem.
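The S3 route mentioned above usually means Redshift's `COPY` command. As a rough sketch, the Python helper below assembles such a statement as text. The helper name `build_copy_statement`, the table, the bucket path and the IAM role ARN are hypothetical, and the sketch assumes CSV files with a header row and an IAM role that can read the bucket. It only builds the SQL string. Running it against a cluster is left to whichever SQL client you use.

```python
def build_copy_statement(table, s3_path, iam_role_arn, ignore_header=True):
    """Return a COPY statement that bulk-loads CSV files from S3."""
    if not s3_path.startswith("s3://"):
        raise ValueError("this sketch only handles s3:// source paths")
    parts = [
        f"COPY {table}",
        f"FROM '{s3_path}'",
        f"IAM_ROLE '{iam_role_arn}'",
        "FORMAT AS CSV",
    ]
    if ignore_header:
        # Skip the first line of each file (assumed to be a header).
        parts.append("IGNOREHEADER 1")
    return "\n".join(parts) + ";"


# Hypothetical table, bucket and role. Inputs are trusted here: the
# values are placed into the SQL text without any escaping.
statement = build_copy_statement(
    "sales",
    "s3://example-bucket/sales/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

In practice this means uploading local files to S3 first and then pointing the load at that bucket.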
Where other companies have tried and largely failed (though there are some other options out there), Amazon’s Redshift really takes over. What will it cost you to use Redshift? Redshift has “no upfront costs” according to Amazon. It is a pay-as-you-go tool.
Instead of charging users one flat rate, Redshift is billed at “…an hourly rate based on the node type and the number of nodes in your cluster.” If you want to check out Redshift without a huge commitment, you can start with a single 2TB data warehouse (a single XL node) at a rate of $0.85 per hour.
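To see what the hourly rate adds up to, here is a small Python sketch of the arithmetic. It assumes the quoted $0.85 rate applies per node, that the cluster runs around the clock, and that nothing else (storage, data transfer and so on) is billed. The function name `cluster_cost` is made up for the example.

```python
from decimal import Decimal


def cluster_cost(hourly_rate_per_node, node_count, hours):
    """Estimated charge: hourly rate per node x nodes x hours running."""
    return Decimal(hourly_rate_per_node) * node_count * hours


# Single XL node at the $0.85 per hour rate quoted above.
one_day = cluster_cost("0.85", 1, 24)            # 20.40
thirty_days = cluster_cost("0.85", 1, 24 * 30)   # 612.00
```

Under those assumptions, leaving the single-node cluster running costs about $20.40 a day, or roughly $612 over 30 days.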
Need more information? Got questions about Redshift? Ask away!