During the past week or so, I've been laying more groundwork for my web application sandbox. Although it's still very spartan in style and spare in content, the sandbox landing pages (http://bradleypmartinsandbox.com) are now quite a bit cleaner. You can check out a screengrab of one of the pages below, or head to the site yourself to take it in! I've also built in functionality for the sandbox page to link to other servers hosting specific applications. Right now there's only one bare-bones app being served (linked in the greet button of Figure 1), but I'm hoping to expand this selection soon (and with much more substantial applications).
Figure 1. Here's a shot of the cleaned-up 'secret' page from my last post (now in the 'appselection' route). It now serves as a link to other web apps I've built, but you still get a kitty picture!
I'll try to involve many of these JS tools as my sandbox/portfolio applications become more substantial.
Figure 2. A D3 interactive cryptocurrency display allows users to select a specific date range for further exploration, and to choose between several coins for visualization.
Figure 3. This clone of a noteworthy, animated presentation by Hans Rosling of gapminder.org shows relationships between GDP, life expectancy, and total population (circle area) for different countries (individual circles) on different continents from the year 1800 up to the present. D3 and its community make a ton of cool data-driven functionality possible, and I'm looking forward to using that in upcoming projects.
More progress in practical app development: linking together a new domain name, recent tutorial work, and EC2 instances
One web development 'next step' I was looking forward to (after completing a whole bunch of back-end tutorials and one whopper of a full-stack tutorial mentioned in the last post) involved branching out by deploying a couple simple apps on my own via AWS EC2 instances.
The end of Colt's Web Development course had students setting up a deployment of the YelpCamp project through Heroku.com. I thought it was neat to watch those videos and learn about another option for deployment, but since I've already done a bunch of learning in the AWS ecosystem, I thought it would be fun (and practical!) to start blazing my own little trail by deploying on AWS instead.
At the same time - and since I'm planning to be pumping out quite a few more projects in the near future - I thought it'd be a good time to register a new domain name for exhibiting a budding portfolio. So, using AWS's Route 53 service, I got hold of bradleypmartinsandbox.com (no prepended 'www') for hosting my sandbox of new projects. [Note: for the next few weeks, it may be something of a crapshoot as to whether or not you can follow this link and find anything interesting. I'll probably be putting it up and shutting it down a lot while I figure out how I want it to function for the long term. I'd like to reach a state where it's highly available and can route users to a variety of project showcases. More on that in the future!]
I enjoyed doing my own Googling to work through the process of configuring an EC2 instance to host one of the applications we made during the web development bootcamp. Loading up an Ubuntu image with many of the required packages for the simple user authentication app (pictured below) wasn't too much of a chore, especially with the package.json file there to manage dependencies. The densest new material (though not that bad, either) was learning how to use NGINX to forward HTTP traffic from port 80 to port 3000 (so the app itself didn't need sudo privileges just to listen on port 80!) and making the slight changes to the app.js file to accommodate this port change.
There'll be much more to share soon. Thanks, as always, for reading, and enjoy a screenshot of today's HTTP interaction with my AWS EC2 web app.
Figure 1. Screenshot-ception! This is a capture I took with a basic user authentication app served up at my new sandbox location (bradleypmartinsandbox.com) through AWS EC2. MongoDB handles the data for user authentication, and user login simply routes to a 'secret page' where I've spat out some unstyled HTML and a link to a recent project screenshot (that you can probably find referenced a few blog posts ago). I'll look into HTTPS routing and user-friendly persistence of the sandbox domain soon (along with adding more interesting projects, of course!).
As of the post last week, I'd been exploring a number of database technologies and had set up a simple serverless chat application via AWS Lambda (thanks to the help of Frank Kane and Brian Tajuddin's udemy.com course on the subject). At the end of the post, I briefly mentioned wanting to explore more front-end stuff (such as the HTML, CSS, and JS mentioned in this post's header) so I could move toward having more agency in creating web apps that are completely my own.
Toward that end, I threw myself into Colt Steele's Web Developer Bootcamp (also on udemy.com) and was blown away by how much value I got out of it. Not only did I get a ton of practice with HTML, CSS, and JS, but I also got a bunch of exposure to even more backend tools, conventions, and languages like Node.js, Express, Mongoose, RESTful routing, and MongoDB. I still have a ton to learn in the web development space (and probably always will!), but feel much closer to being confident in spinning up my own deployable (and hopefully functional!) apps.
I think this is as good a time as any to lay down a tangible "pivot point" in my development as a software engineer. I love taking courses and learning new stuff in a structured environment, but I think it's equally (if not more) important to be starting up my own projects and getting them deployed. Stay tuned for another post about milestones, challenges/successes, and progress during the next week. Until then, I'll share some screenshots of progress on the YelpCamp app we built as a final/cumulative/ongoing project in Colt's web dev course.
Source code from my time in the course can be found at my GitHub location (https://github.com/bradleypmartin/webdevbootcamp). Different versions of the YelpCamp project code are in there; the photos below are from the YelpCampFinal build. There's still some more functionality that Colt and his TAs have been adding over time (and that I have yet to put in the app) like fuzzy search, password reset, interaction with the Google Maps API, etc., and when I include some of those extras, they'll be included in the Final build as well.
Figure 1. Landing page for the YelpCamp website. The CSS for this landing page was pretty cool to work through, as it involved the display of a transitioning slideshow of 5 photos (one of which is shown in the screengrab here).
Figure 2. Index page for the YelpCamp website. In the top-left corner of the photo, you may be able to pick out hints of authentication/authorization functionality, which was indeed a cool part of this course. The material we covered here was a good complement to the AWS Lambda project discussed in my last post (where we were using Cognito to take care of authentication).
Figure 3. The YelpCamp comments page was an excellent exercise not only in additional functionality of the website, but also in nested application of RESTful routing (for Create/Read/Update/Delete operations on comments for each campground).
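To make the idea of nested RESTful routing a bit more concrete, here's a minimal Python sketch that just lists the seven conventional RESTful routes for a child resource nested under a parent. The function name, the URL parameter names (`:id`, `:comment_id`), and the exact paths are my own illustrative assumptions, not a copy of the course's Express code, and a real app may only implement a subset of these routes.

```python
def nested_routes(parent, child, child_param):
    """Return (HTTP verb, path, action) tuples for `child` nested under `parent`.

    All names and path shapes here are illustrative assumptions that follow
    the usual RESTful convention; they are not taken from the YelpCamp source.
    """
    collection = f"/{parent}/:id/{child}"       # e.g. /campgrounds/:id/comments
    member = f"{collection}/:{child_param}"     # one specific child record
    return [
        ("GET", collection, "index"),           # Read: list all children
        ("GET", f"{collection}/new", "new"),    # form for a new child
        ("POST", collection, "create"),         # Create
        ("GET", member, "show"),                # Read: one child
        ("GET", f"{member}/edit", "edit"),      # form for editing a child
        ("PUT", member, "update"),              # Update
        ("DELETE", member, "destroy"),          # Delete
    ]


if __name__ == "__main__":
    for verb, path, action in nested_routes("campgrounds", "comments", "comment_id"):
        print(f"{verb:6} {path:45} {action}")
```

The point of the nesting is that every comment URL carries the campground's id along with it, so each comment operation knows which campground it belongs to.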
A functional, 'serverless' chat application running on AWS (courtesy of S3 and Lambda functionality)
Hey there! So, toward meeting my goal of having a functional, bare-bones web application (with a Cassandra backend) up and running by today, I explored several execution options. And, as it turns out, I didn't quite accomplish that aim. But I got to something close, and I learned a lot of cool new stuff in the process!
What I did end up doing was creating (as in the title of the post) an AWS Lambda-based chat application that runs on an Amazon DynamoDB NoSQL database (rather than Cassandra) backend, thanks in large part to a great udemy.com tutorial by Brian Tajuddin and Frank Kane. In addition to the whole, hip concept of 'serverless' deployment (and DynamoDB), I got to learn about many new AWS technologies including the API Gateway, S3's built-in capacity to host web traffic, Cognito user authentication and management, and CloudFront CDN functionality.
Thanks, Frank and Brian! Some pictures of the end result (deployed application) follow.
Figure 1. The serverless chat application sign-in screen.
Figure 2. A window where you can start chats with other verified members of the chat app. I believe Frank is an enthusiastic fan of Star Trek!
Figure 3. Here I'm having a conversation between two accounts I created to show an example of the centerpiece use case (chat, of course!) of the app.
After finishing up the udemy.com Cassandra intro course described in the last blog post, I've been eager to learn more about distributed data administration. A natural next step was to proceed over to datastax.com and to commence wolfing down several of their own free (and very useful!) Cassandra tutorials.
The DataStax tutorials are building quite effectively on the udemy.com introduction material. Where before I was setting up communication between one or several virtual Ubuntu instances on my local Windows machine, the DataStax sequence has me working among AWS EC2 instances as well, which is pretty cool! Their 201- and 220-level courses had me coding through exercises on just one EC2 node, and today I started linking up 3 t2.medium nodes to tackle some of their 210 course (enterprise operations) problems. This presented me with a whole bunch of rewarding challenges back-to-back:
1) linking up S3 buckets with EC2 nodes;
2) using the S3 bucket to transfer and install a tarball distribution of DataStax Enterprise (when a direct connection between the EC2 instances and DataStax was giving me some trouble);
3) negotiating healthy topology of a small Cassandra cluster through conditions that differ slightly from DataStax's current documentation for their 210 course; and
4) troubleshooting the bootstrapping of an individual node in the cluster (I think my running a stress test on that node in isolation before adding it to the cluster gave me some problems!).
At the end of the day, I was able to get all three nodes up and running normally. As I may have mentioned before, I've got a bunch of ideas for web application projects that I'd like to start building, but I'm trying to take manageable 'baby steps' here first. By the end of the week, I'd like to have at least an absolute bare-bones web server up and communicating with a Cassandra cluster.
Figure 1. I finally cleaned up my final AWS EC2 node enough to keep it from hanging during the Cassandra bootstrapping process. Yay! As you can see, I tend to run nodetool status/dsetool status a lot.
Hey again! I got a few other things done early this week that kind of fit into the category of the last post's topic. Two fun developments were 1) a bit of work in Apache Zeppelin (for prototyping/viewing 'big data' analysis in a notebook environment), and 2) a crash course in the Cassandra distributed and non-relational datastore technology.
The Cassandra tutorial was especially fun because within a few hours of starting the course, I was (admittedly, with my hand held through the process!) creating and updating a mock vehicle tracking web application built with Java and CQL (the Cassandra Query Language) and hosted in a virtual Ubuntu environment. It looks like soon I'll be spinning up several virtual machines to test out scalability on a small Cassandra cluster. It's fascinating stuff, and I'm looking forward to coding up my own projects in similar spaces.
A couple of screenshots (from the tutorials above) follow.
Figure 1. This is actually just output from a ready-made tutorial script for running and analyzing a Spark job in Zeppelin, but still, the results were cool to see! I like that this kind of notebook environment is available. While it's not something you'd run in production/deployment, I enjoy the option to fiddle around graphically with code and objects during early prototyping of a project (sort of like the Python/Jupyter relationship).
Figure 2. Here I'm completing a code-along creation of a Java/Cassandra-based web application that updates and displays (via interaction with Google Maps) vehicle tracking data. Fun stuff! I'm looking forward to deploying database tasks to a cluster of several virtual nodes in the next few days. Thanks to Ruth Stryker (of Infinite Skills) for the great udemy.com course I'm taking to learn about Cassandra!
Hi all! In preparation for some upcoming interviews, I've been working diligently through a variety of tutorial courses at udemy.com. It's been an exciting and enriching time, and I'll share some specifics (with screenshots) below. A subset of the coursework/progress can be found in a variety of new repositories at my GitHub location (github.com/bradleypmartin).
Focus area 1: A basic image processing algorithm written in Python
A lot of the items to follow (big data tutorials; core coding in new languages) were sort of 'guided tours' through the various spaces. This mini-project, though - for whatever it's worth - was totally my own creation! I've been thinking a lot about image processing lately, and thought it'd be neat to come up with a (naive) algorithm that can reconstruct a tiled and scrambled image.
This algorithm cuts up an image into a user-defined set of rows and columns, gives some "overlap" pixels to each one of the resulting subimages, offsets each subimage by a (randomly-chosen) smaller number of pixels, and then scrambles the subimages' order and removes any associative data between them.
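As a rough illustration of the cut-and-scramble step, here's a minimal, standard-library-only Python sketch. It isn't the code from my ImageReconstruction repo: the function name, the parameter names (`overlap`, `max_offset`, `seed`), and the choice to represent an image as a plain list of pixel rows are all assumptions made to keep the example self-contained.

```python
import random


def cut_and_scramble(image, n_rows, n_cols, overlap=2, max_offset=1, seed=None):
    """Cut a 2D grid of pixel values into overlapping, jittered tiles, then shuffle.

    `image` is a list of equal-length rows (a stand-in for a real image array).
    Every name and default here is a hypothetical choice for illustration.
    """
    if max_offset >= overlap:
        raise ValueError("the random offset should be smaller than the overlap")
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // n_rows, width // n_cols
    tiles = []
    for r in range(n_rows):
        for c in range(n_cols):
            # Random jitter so tile boundaries don't line up perfectly.
            dy = rng.randint(-max_offset, max_offset)
            dx = rng.randint(-max_offset, max_offset)
            # Grow each tile by `overlap` pixels, clipped to the image bounds.
            top = max(0, r * tile_h - overlap + dy)
            left = max(0, c * tile_w - overlap + dx)
            bottom = min(height, (r + 1) * tile_h + overlap + dy)
            right = min(width, (c + 1) * tile_w + overlap + dx)
            tiles.append([row[left:right] for row in image[top:bottom]])
    # Shuffling (and returning no positions) discards the original ordering.
    rng.shuffle(tiles)
    return tiles
```

Only the pixel data survives in the returned tiles, so anything downstream has to rediscover the layout from the pixels alone.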
Next, the 'fun part' commences by initializing and then filling a graph of subimage relationships based on pixel similarity, followed by (attempted!) reconstruction of the original order using the associative graph. In the GitHub repo for this project (ImageReconstruction) there are three separate .jpg files I've tried so far with the algorithm, and it works pretty well on these (usually 100% reconstruction accuracy) for small numbers of 'cuts' (maybe around 15-25 tiles, total). Finer divisions of the photos tend to result in a lot more computational overhead and reduced accuracy of the reproduction (fairly quickly so, too, as you increase the number of subimages past about 20-25).
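Here's a simplified sketch of what "filling a graph of subimage relationships based on pixel similarity" can look like. It is an assumption-heavy stand-in rather than my actual implementation: it only considers left/right neighbors, it uses mean absolute difference as the similarity measure, and it assumes neighboring tiles share exactly `strip` columns with no random offset to search over. All function names are hypothetical.

```python
def edge_cost(left_tile, right_tile, strip=1):
    """Mean absolute difference between the last `strip` columns of `left_tile`
    and the first `strip` columns of `right_tile` (lower means more similar)."""
    rows = min(len(left_tile), len(right_tile))
    total, count = 0, 0
    for y in range(rows):
        width = len(left_tile[y])
        for k in range(strip):
            total += abs(left_tile[y][width - strip + k] - right_tile[y][k])
            count += 1
    return total / count


def build_right_neighbor_graph(tiles, strip=1):
    """For each tile index, store the cost of placing every other tile to its right."""
    return {
        i: {j: edge_cost(a, b, strip) for j, b in enumerate(tiles) if j != i}
        for i, a in enumerate(tiles)
    }


def best_right_neighbor(graph, i):
    """Pick the cheapest candidate to sit directly to the right of tile `i`."""
    return min(graph[i], key=graph[i].get)
```

Every tile gets compared against every other tile, so the work grows roughly with the square of the number of tiles, which is one plausible reason finer divisions get expensive quickly.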
Future work could involve any or all of the following:
Figure 1. My cat, Chocko, offered her modeling talent for the image processing project. I liked the Jupyter notebook environment for preliminary work on the project due to its easy visualization of various steps/modules-in-progress.
Focus area 2: 'Big data' tutorials and exercises
'Big data' tools and technologies are in huge demand at many of the places where I'm applying to work, and I'm pushing forward with a lot of tutorial material covering Spark (and other parts of the Hadoop ecosystem) access and manipulation of large, distributed datasets, MySQL creation and query of databases, Cloud9 development and AWS cluster deployment of big data algorithms, etc.
It's a lot to take in over a short period of time, but I'm very glad to be doing so; there's so much potential for engaging in cool new projects with all these new tools! A couple screenshots of my early progress are below.
Figure 2a. Here's a screenshot of my starting MySQL development environment in AWS Cloud9. I'm looking forward to creating (well, following along in creating, at least!) a web application associated with this course today (4/30/18).
Figure 2b. This was a fun process: I had spun up 5 m4.xlarge nodes on AWS to run some analytics on a 1 million-member movie ratings dataset as part of one of Frank Kane's online tutorials. I was using a Python/Spark interface here.
Focus area 3: Other new programming languages and development environments
Along with the space-specific 'big data' tutorials and exercises, I've also been pushing forward in broadening my proficiency in new languages and environments. Matlab and C++ have served me well in the past, but I wanted to start branching out as well. Scala, Python, and MySQL have been a big part of the efforts described above, and I've also been working on some core competency in Java. All the exercises thus far have brought me through a whole new wealth of text editors and IDEs (each with their own quirks and benefits - for example, I love how Git version control is baked right into many of them!).
You can see some examples of Python scripts and notebook work both in the ImageReconstruction repository mentioned above and in the SparkCourse repo. I've got some Scala code in the ScalaAndSparkCourse repo, and my core Java work is in the JavaCoreExercises repo.
Some of these repositories are more well-developed than others, but I hope to keep building on each as time permits.
Figure 3. I bet you haven't seen a 'Hello World' implementation like this before! But honestly, there's some more interesting stuff (classes/inheritance, exception handling, etc.) in my associated JavaCoreExercises repository.
I've recently made a lot of research and extracurricular code available at my GitHub location (https://github.com/bradleypmartin). Some of it is currently in more of a "data dump" format, so look forward to my branching and cleaning up a lot of it over the next days and weeks! There are some Matlab and C++ examples in there now, with basic Python notebooks to follow. Topics explored in these collections include my Ph.D. and post-Ph.D. research into numerical solution of PDEs, coursework in general inverse and multigrid methods, and exploratory work in data science. Enjoy!
An article in the New York Times this weekend (along with the significantly increased volatility of U.S. equities during the past few days) got me thinking more about market dynamics, and I thought I'd try to replicate (as closely as possible with the resources and time I've got today) the results of the Times.
The illustration the Times made was that cumulative equity investment returns over the past 20-25 years can be disproportionately attributed specifically to the overnight gains in asset value (as opposed to intraday increases). The Times presented a plot of cumulative returns of the SPY exchange-traded fund (ETF) (NYSEARCA:SPY) since its inception (around 1994) in two different flavors:
1) assuming the shareholder had bought shares of the ETF at market open and sold at market close every business day; and
2) assuming the opposite scenario: that the shareholder had bought near market close every day and sold at market open.
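The two schemes above can be sketched in a few lines of Python. This is a simplified illustration rather than the Times's method or a Quantopian backtest: it assumes you already have a chronological list of daily (open, close) prices, that you can trade exactly at those prices, and that commissions, dividends, and whole-share rounding can be ignored. The function name and the prices in the usage example are made up.

```python
def split_returns(bars):
    """Split a price history into intraday and overnight growth factors.

    `bars` is a chronological list of (open_price, close_price) tuples.
    Returns (intraday, overnight, buy_and_hold) as multiplicative growth
    factors, e.g. 1.25 means a 25% cumulative return.
    """
    intraday = 1.0
    overnight = 1.0
    for i, (open_price, close_price) in enumerate(bars):
        # Scheme 1: buy at the open, sell at the same day's close.
        intraday *= close_price / open_price
        if i > 0:
            # Scheme 2: buy at the previous close, sell at today's open.
            previous_close = bars[i - 1][1]
            overnight *= open_price / previous_close
    # Buying at the first open and holding until the last close.
    buy_and_hold = bars[-1][1] / bars[0][0]
    return intraday, overnight, buy_and_hold


if __name__ == "__main__":
    made_up_bars = [(100.0, 101.0), (102.0, 101.0), (103.0, 104.0)]
    intraday, overnight, buy_and_hold = split_returns(made_up_bars)
    print(intraday, overnight, buy_and_hold)
```

Under these idealized assumptions, the intraday and overnight factors multiply out to the buy-and-hold factor, so the question is really how the total gain divides between the two.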
Although it may not be news to those familiar with inter- and intraday equities trading, over the lifetime of the SPY ETF, there has been a substantial advantage to holding overnight. I found this kind of surprising! To have a look at the phenomenon myself, I loaded up my Quantopian account (www.quantopian.com) and did some of my own backtesting.
Below, I've copied and pasted results of Jan 2002-Jan 2018 investment strategies matching the two schemes above (Quantopian's backtesting routine won't let me go all the way back to 1994). Although the timeframe here is somewhat different from that presented by the Times, we see similar results: an exclusively overnight long position in SPY seemed to present a distinct advantage over an exclusively daytime hold.
Figure 1. Simulated returns of an initial $100,000 invested in the SPY ETF in January 2002, ending in January 2018. Here, as many shares as possible of the ETF are purchased right before every market close and sold again at the next business day's market open. Although returns with this strategy are somewhat less than a simple buy-and-hold, we can see that the shareholder did indeed enjoy most of the index's gains (141% cumulative return in the overnight strategy vs. about 209% for buy-and-hold). No commissions were accounted for in this simulation or any others in this post.
Figure 2. Simulated returns of an initial $100,000 invested in the SPY ETF in January 2002, ending in January 2018. The strategy in this example is the opposite of the one whose results are shown in Figure 1: here, shares are bought at market open and sold at market close. Over the ~16 year period here, the shareholder still makes gains in equity, but not nearly to the extent seen above.
After seeing these results, I thought it'd be neat to check out how the overnight edge manifested in a couple of other cross-sections of market history. For the first section, I thought I'd rerun the two backtests above on the interval from January 2002 to January 2010 (shortly after the financial crisis late in that decade). These results are shown below.
Figure 3. "Overnight edge" strategy backtested on the SPY ETF between Jan 2002 and Jan 2010. The strategy did pretty well here vs. the index benchmark (also SPY)!
Figure 4. A market-day-exclusive alternative to the overnight edge strategy is tested here on the SPY ETF between Jan 2002 and Jan 2010. Statistics of this run (seen in the printout) are significantly worse than those of the overnight strategy across the board.
The overnight strategy simulation really performed well during this earlier time period! Finally, I thought I'd test both strategies out on the span between Jan 2010 and Jan 2018. My results are below.
Figure 5. Jan 2010-Jan 2018 backtest of the overnight edge strategy on SPY.
Figure 6. Jan 2010-Jan 2018 backtest of the daytime strategy on SPY.
In this more recent timeframe, a significant share of the cumulative equity gains seems to shift toward intraday market dynamics (though the overnight gains still appear to have an edge in this particular simulation). I wonder: what factors in action during the more recent bullish market might have caused such a shift? And how can I best incorporate today's overnight gap trends and correlations into an efficient trading algorithm?
The article and results certainly gave me a lot to think about, and I'll share more algorithmic trading results soon!
Figure 1. Heat map within a 2D domain featuring an insulating triangle. Temperature is high (value 1; red color) near the top of the domain (Dirichlet boundary) and cold (value 0; blue color) near the bottom. The insulating triangle in the middle of the domain creates an intriguing temperature profile in the elliptic (equilibrium) problem results shown here.
A lot of my dissertation work (and papers that branched out from it) involved numerical solutions to diffusion and wave problems in which model parameters (density, wave speed, thermal diffusivity, etc.) change suddenly between two or more different materials. Many of the test cases I presented dealt with mildly-curved interfaces, and although I presented a few cases with cornered interfaces, I didn't have a high-order (highly accurate) approach prepared and vetted for the latter scenario.
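For readers who like to see the equations, here's one common way to write a model diffusion problem with a material interface. The notation is my own shorthand for this post (an assumption, not lifted from a specific paper): u is temperature, the domain is split into two materials Ω_1 and Ω_2 meeting along an interface Γ with normal direction n, and κ_1, κ_2 are the (constant) diffusion coefficients on either side.

```latex
\begin{aligned}
\frac{\partial u}{\partial t} &= \nabla \cdot \left( \kappa_i \nabla u \right)
  && \text{in } \Omega_i,\ i = 1, 2, \\
u\big|_{\Omega_1} &= u\big|_{\Omega_2}
  && \text{on } \Gamma, \\
\kappa_1 \frac{\partial u}{\partial n}\bigg|_{\Omega_1}
  &= \kappa_2 \frac{\partial u}{\partial n}\bigg|_{\Omega_2}
  && \text{on } \Gamma.
\end{aligned}
```

The second and third lines say that temperature and flux are both continuous across the interface. Setting the time derivative to zero gives the elliptic (equilibrium) version of the problem.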
Over the past month or so, I've been working on incorporating some singular basis functions into radial basis function-generated finite difference (RBF-FD) collocation schemes in a manner that significantly increases the accuracy of elliptic and parabolic (diffusion) problems with cornered interfaces. I've also been combing through existing literature to see where this approach may fit. It seems as though a number of investigators have used similar methods and analysis for elliptic (equilibrium) problems before, but the number of articles covering parabolic (time-dependent) problems is far smaller.
I've had a decent amount of success so far in both the elliptic and parabolic cases, but a lot of challenges remain. One exciting aspect of the approach I'm using is that it would translate fairly readily (at least in theory and on paper; I'm not sure at all about its stability) to simple hyperbolic problems of recent interest (acoustic and electromagnetic wave transport). Keep a lookout for more updates on this soon!