I explore ETags and how they can be used in cURL commands when interacting with Oracle REST APIs. I also discuss some of the performance benefits of using ETags. This is not exhaustive, but I hope it introduces you to ETags or reminds you of their existence! But first…
LATE-BREAKING NEWS!!
A related video
FYI: I reference a CSV_DATA table throughout this post. We use it pretty extensively in this LiveLab. And we just recently presented a webinar based on that same LiveLab. You can check that out below!
Don’t know what ETags are? No worries, here is a definition:
The ETag (or entity tag) HTTP response header is an identifier for a specific version of a resource. It lets caches be more efficient and save bandwidth, as a web server does not need to resend a full response if the content was not changed. Additionally, etags help to prevent simultaneous updates of a resource from overwriting each other (“mid-air collisions”).
If the resource at a given URL changes, a new Etag value must be generated. A comparison of them can determine whether two representations of a resource are the same.
ETags can help to guarantee the provenance of your resources (like the auto-REST enabled table you’ll see shortly) but they can also ensure your applications consume fewer server/database resources, and load comparatively faster.
To illustrate how ETags work, I did some tinkering with cURL commands, ORDS, and a Podman container. Read on if ye dare…to see what I discovered!
Oracle REST APIs and ETags
A couple of weeks ago, I noticed in the cURL documentation there was support for ETags. And the cURL docs have options for both --etag-save and --etag-compare (practical examples to follow). When you use these options in your cURL commands, you’ll either:
save an eTag to a separate text file (locally, like on your desktop in my example below), or
compare the ETag (in that existing file) to an ETag that belongs to your REST-enabled resource (the CSV_DATA table, which you’ll see in a second)
Oh, that’s a lot of words! So read it again and then continue with my walkthrough. Meanwhile, I’ll spin up this Podman container.
We are back in Podman.
INFO: Want to learn more about using Podman and Oracle database tools? Check out my other two Podman-related posts here and here!
ORDS in Standalone mode
I need ORDS up and running for this demonstration, so I issued the ords serve command in my Terminal. This will launch ORDS in standalone mode (using a Jetty server, as seen in the image). Once it’s initialized, I can log into SQL Developer Web to interact with my database (remember, in this example, it lives in a Podman container).
Here, I’ve logged into SQL Developer Web as a non-ADMIN user (ORDSTEST in this case).
From the Database Actions Launchpad, I navigated to the SQL Worksheet.
And to keep this quick, I reused a table I created for that webinar we just did. I also auto-REST enabled it (so I could play with the cURL commands). Below, you’ll see it’s just a quick right-click with the mouse.
FYI: As a reminder, if you want to learn more about cURL commands, check out the LiveLabs workshop that this is based on. You can find that here.
Getting the cURL Command
Once I auto-REST enabled the CSV_DATA table, I selected the GET ALL REST cURL command.
This is the cURL command I’ll use for this experiment.
At this point, I still wasn’t sure that an ETag was sent from the server for those auto-REST-enabled resources (in this case, the CSV_DATA table). I know they are present when you build your own REST Modules with ORDS; (at the time) I was just less confident about the auto-REST resources.
SPOILER ALERT: ETags are present for auto-REST-enabled resources too (I'm dumb, and this is pretty widely known)!
–etag cURL options
Once I knew ETags were accessible for auto-REST-enabled resources, I experimented with cURL‘s --etag options (you’ll see how I implemented these in the upcoming cURL command).
The --etag-save [filename] and --etag-compare [filename] options work such that when you issue the --etag-save in addition to that initialcURL command, a single-line file will be saved to the directory you are currently in (you’ll see that file shortly).
This differs from how an application might work, but the concept is the same. You’re storing the ETag’s value somewhere accessible to the application. For my purposes, I need to keep this ETag somewhere the cURL command line utility can find it.
The initial cURL command
I hopped over to my Terminal and used that [slightly modified] cURL command (the one I previously retrieved from the SQL Worksheet). You’ll see that I included additional options/arguments:
--verbose
--etag-save
| json_pp
This is the first cURL command I issued.
FYI: Apparently, the json_pp command utility is a part of Perl. I think this ships with the macOS, but I'm not 100% sure. Do you know? It worked for me and pretty printed out my JSON response (notice how I used the pipe "|" in addition to the actual command).
When you use that --etag-save option, a file with the value of the ETag will be saved locally. You can see me retrieving that file and reviewing the ETag file (note in the above cURL command, I named the file “myobjectetag.txt“).
Listing the files in the current directory.Locating the myobjectetag.txt file.Opening the file and inspecting the ETag value.
I can now use this ETag in subsequent GET requests to determine if the resource (the CSV_DATA table) I’m requesting has changed since I last interacted with it. What would constitute a change? Maybe rows have been updated or removed; perhaps an additional column was added. Or maybe the table was restructured somehow; it could be any change.
But, let me pause briefly and explain the --verbose option.
About the verbose option
The printout from the --verbose option.The remaining JSON object is nicely printed out.
I used the --verbose option to inspect the information available when interacting with this Oracle REST endpoint. I don’t need to include it now since I know the ETag is coming through, but I left it in this cURL command example so that you could have a look yourself. You’ll see loads of information, including (but not limited to):
Connection information
The cURL version used
The Status Code returned (200 or OK in this case)
ETag info
In this example, all I care about is the presence of an ETag. I can now use that ETag in a subsequent GET request to determine if the resource on the server side has changed. Here is what the cURL command looks like with the --etag-compare option:
That cURL command looks very similar, except for that --etag-compare option. In this situation, cURL first checks to see if your ETag and the resource’s (the API endpoint on your server) ETag match. If they do, the request stops. And if you use the --verbose option, you can see what comes back from the server:
A whole bunch of nothing. Not really, though. That “If-None-Match” Header is the secret sauce, though. That is a conditional Header that is passed over to the server. Essentially it says, “If this Header value doesn’t match yours, then send over the requested data; otherwise, end the request here because we already have the information we need/. It’s stored/saved (presumably) locally.“
INFO:Read up on caches, because that's essentially what your application is going to use instead of having to go through the entire GET request/response cycle.
The request is terminated, but what does this mean from a performance perspective? Well, say you have a webpage that loads and later reloads in response to a user’s interaction (I simulated this with the two cURL commands). That page will probably need some information from the server to populate that page. In a situation like this, you could first ask your application to share your copy of the ETag with the server in a subsequent GET request header (“If-None-Match“). And if nothing has changed, you could speed up page load times by just refreshing with what you have stored in a cache while freeing up resources on your server for other processes. But this is just one example.
Possibilities with ETag
I’ve given this some thought, and I bet there are quite a few use cases where referring to an ETag before executing an HTTP method (like a GET or GET ALL) might be helpful.
You may want to periodically check to see if a resource has changed since you last interacted with it. Could you incorporate ETags into your build processes or longer-running jobs (maybe something around data analysis)?
Actually, ETags play a massive role in JSON-Relational Duality Views. We have an entire section in the ORDS docs on how to use them! And suppose you want to download a containerized version of the Oracle database 23C (the one that supports JSON-Relational Duality views). You can do that via this link (I think I should do this too and highlight some of the cool ORDS + JSON Duality View features)!
Well, this brings me to the end of this post. I’m hoping you learned something and came away with some good resources. And if you found this post helpful, please pass it along! And don’t be afraid to comment too! I’d love to hear your thoughts. Maybe I’ll even include your idea in a follow-up post ๐คฉ!
The title says it all. I’ve run through this about ten times now. But I’ll show you how to start a Podman container (with a volume attached) and install ORDS on your local machine. And then, once installed, we’ll create and REST-enable a user so that the user can take full advantage of Oracle REST APIs. (aka ORDS). I’ll finally show you how to log into a SQL Worksheet as that new user. Ready? Let’s go!
Oracle Container Registry
First, visit the registry. Navigate to the Database product category, then select Enterprise. Even better, just navigate to the 21cor23ai images directly (thanks Killian ๐! Ensure you’ve also signed into the site (otherwise, you won’t be able to pull this image).
For Podman, I’ll review the preferred way to start this container (with a volume; for persisting your data across sessions).
Volumes
Start your Podman machine with the podman machine startcommand. Then create a volume with the podman volume create command (that way, you can save data locally and use that volume each time you start your container). Now that we have the volume, we can create a new container and attach that volume simultaneously (more Podman volume info here).
There are a few ways you can attach volumes when starting a container, but I like the one I found in this video:
Ironically, this is an Oracle video. But it is one of the most straightforward ones I found. To start the container, here is the command I used:
podman run -d --name entdb213 -p 1521:1521 --mount=type=volume,source=entdb213vol,destination=/opt/oracle/oradata container-registry.oracle.com/database/enterprise:21.3.0.0
About port mapping
You’ll notice that I used the following port mapping -p 1521:1521. You can remove that leading 1521. If you do, Podman will bind any exposed port to a random port on your host (a MacBook, in my case) within an ephemeral port range. Ephemeral?? (I guess that means all/any available ports, assuming none of these).
At this point, I have created a volume and started a container (with an Oracle Enterprise database inside).
PRO TIP: If this is your first time starting a container with a database of this size, it will take several minutes. So go do some chores while you're waiting ๐คช
Altering the db password
I’ll change the database Administrator password to something I can easily remember (“oracle“) using this command:
# Original command
# docker exec <oracle-db> ./setPassword.sh <your_password>
# My modified command
podman exec entdb213 ./setPassword.sh oracle
Changing the password to something I can easily remember.
Note: There are several shell scripts included in this container; one of which is the change password script. There are more details on the Oracle Container Registry > Oracle Database Enterprise Edition page (redirects prevent me from linking directly to that page).
Downloading ORDS
Next, I’ll head to the Oracle REST Data Services download page. And download the latest ORDS build (I’ll be setting this up shortly, I’m just gathering and configuring everything now).
Once that ZIP file is in my downloads folder, I’ll unzip it. At this point, this folder will still be named ords latest. You can certainly keep it like that, but I’ve renamed it to ords_product_folder. This is similar to how we refer to it in our installation/configuration documentation (changing it might make it easier to follow along).
ORDS Configuration
There are two configuration steps I need to perform before I can begin the ORDS installation. You’ll need to set an Environment Variable to the binaries (these are in the bin folder, you should see that in the above image) found in the ords_product_folder. Secondly, you’ll need to create an ORDS Configuration folder.
WAIT: If you're still reading this, how is my approach? After some research, placing these two folders in the "Library" seemed to make the most sense. I'm not sure what the analog on a Windows machine would be though. Drop a comment below if you know!
At this point, I’m nearly finished with this initial configuration. I next opened my .zprofile file (this is the file where I’m keeping most of my environment variables) and added the following paths:
Pourquoi? I can't seem to find a definitive answer as to where these paths should be saved, but this thread on Stack Exchange does a great job explaining all these files (and when/where they are used).
ORDS Installation
You’ll want to exit out of all your Terminal sessions so that the next session can pick up those changes to the .zprofile file. Podman will still keep doing its thing in the background, and hopefully, by this time, the database container will display as “healthy.”
๐ STOP: This may be obvious to you, but it wasn't to me, the database needs to be ready (healthy), online, and active (whatever you want to call it) for ORDS to install. You can always issue the podman ps command to check the status of the container.
Remember this; you’ll need it shortly.
ORDS install, the first attempt
In a new Terminal, I’ll issue the ords installcommand. If you’ve set up your bin and config environmental variables like me, then you shouldn’t have any issues. Moving through the first few steps is easy.
The ORDS interactive installer will default to recommended settings. Most of the time, these will be correct. Since this is my first time installing ORDS, I’ll choose Option 2 in that first step. I can use “localhost” as the database hostname and 1521 as the port.
When you get to the database service name, that’s where you might get hung up. The ORDS installer assumes default configuration settings. But here, if you select “orcl” as the database service name, it will install ORDS in the entire database. This is not technically incorrect, but our ORDS Best Practices recommends you install ORDS in a Pluggable Database (PDB). So I’ll issue the podman logscommand (in my case: podman logs entdb213) to find the name of the PDB; ORCLPDB1 (that’s the default for this container, it’s well-documented in the container registry docs, I’m just an idiot).
This is a neat trick, but it’s also documented in the Container Registry docs.
The ORDS interactive installer is very forgiving. I’ve noticed I can exit out of the installation process pretty much anywhere. Given that fact, I’ll restart my Terminal and start the ORDS install over (now that I have the correct database service name).
ORDS install, for real this time
In these images, you can better see the ords installcommand (it was slightly grayed out in the previous image). And you’ll also see all the steps completed. You can probably keep everything default as I did. You might want to since our documentation refers to that 8080 port (as seen in the images) in most of our tutorials and examples. I find it easier to follow along with the docs when everything matches.
๐๏ธ NOTE: Make sure you choose to run ORDS in standalone mode. That way you can continue to follow along in later steps.
Here you can see the paths to the bin and config folders.Conditions are perfect.
The rest of the installation is largely unremarkable. You will notice a few things, though:
The paths we’ve pointed to for the config and bin folders
The “settings” names(no action required by you, it’s just cool to see)
once the installation finishes, the text “Oracle REST Data Services initialized” will appear
That final image confirms that ORDS is now running in standalone mode. You can visit the following:
localhost:8080/ords/sql-developer
Logging into the SQL Worksheet, the first attempt
And try to log in with the SYS credentials.
One does not simply log into Database Actions…you must REST-enable a user first.
SPOILER ALERT: You can't ๐! Muahahahahaha!
That was a dirty trick. While ORDS is installed in ORCLPDB1, we must first create and REST-enable a user. I will make up for this dirty trick by sharing one of my favorite cheat codes for learning SQL and PL/SQL.
๐๏ธ NOTE: You cannot REST-enable the SYSTEM or SYS user.
If lucky enough, you have an Oracle Cloud Free Tier account with at least one Autonomous database provisioned.
CORPORATE SHILL ALERT: You can sign up for one here ๐.
Code cheating with the Autonomous database
Login to Database Actions as the Administrator. Navigate to User Management. In that dashboard, select the + Create User button.
In the Administration section.Click the Create User button.
When the slider appears, enter the information for the ORDSTEST user (like you see in the image here).
Once you’ve entered everything in, hit that “Show code” switch.
You’ll need to enable “Web Access” for this user. When you do this (click the switch), two things will happen:
The CONNECT and RESOURCE roles will be automatically selected for you
The Authorization required toggle will be enabled – shut this off for now
Once you’ve done that, you can select the “Show code” switch at the bottom of the slider. This will reveal the actual code that is being executed should you click the “Create User” button (which you will most certainly NOT!).
Copy and paste this into a text editor.
I copied this code and placed it into a text editor. I made one small change to the QUOTA line (at the bottom of the script).
Stole this from the ORDS Quick Start Guide.I am making a slight change to that last line.
Then I headed back to my Terminal and opened a new tab. I’ll execute this code in the database (remember, it’s in that Podman container running idle in the background this entire time) using SQLcl.
Forgot your connection string?
IF YOU FORGET the connection string format for logging in, have no fear! That Jeff Smith recently showed me the history command. I also have a couple of shorts on how I used the command:
Using the history + the number of your choice.Here is a connection string that is close enough!Entering the slightly modified string into SQLcl.
Using SQLcl to REST-enable a user
Now that I have the proper format for the connection string, I’ll adjust it so the password is correct. Then I’ll execute the code in SQLcl to create a new database user and REST-enable that user’s schema.
I changed that final line; this is what it looks like in SQLcl.
Cheat code with PL/SQL in SQLcl
I’ve just learned you can REST-enable yourself by logging into SQLcl (i.e., connecting to the database in the Podman container) and issuing the following command:
EXECUTE ORDS.ENABLE_SCHEMA;
This command assumes that you have already been granted the CONNECT and RESOURCE roles but have yet to REST-enable your schema (what allows ORDS to act as the intermediary between the database and the rest of the web).
The command will employ the procedure’s default parameters, which are:
ORDS.ENABLE_SCHEMA( p_enabled IN boolean DEFAULT TRUE, p_schema IN ords_schemas.parsing_schema%type DEFAULT NULL, p_url_mapping_type IN ords_url_mappings.type%type DEFAULT 'BASE_PATH', p_url_mapping_pattern IN ords_url_mappings.pattern%type DEFAULT NULL, p_auto_rest_auth IN boolean DEFAULT NULL);
Here is what a sample output would look like, if I were signed in as the HR user:
An example output is if I were signed in as the HR user.
๐ FYI: This above image is just a sample, and not related to the rest of the images in this article. Be sure to pay attention to the connection string (sql hr/oracle@localhost:1521/freepdb1). This is if the HR user is logging into SQLcl and REST-enabling their own schema. That's why you see references to HR throughout. I don't want anybody to get confused!
Logging into the SQL Worksheet, for real this time
With all this code executed, I can NOW go back to the SQL Worksheet (remember, we’re on localhost:8080/ords/sql-developer) and log in as the newly created ORDSTEST user.
I am logging in as the ORDSTEST user.A true “will-they, won’t they” moment.Congrats, you have arrived!
And once you’re in, you’ll notice the SQL Worksheet is no different than what you might expect in the Oracle Autonomous database. So if you made it this far, go forth and CREATE, DELETE, INSERT, SELECT, and Query away, friends!
Shutting it all down
Once you are done playing and tinkering, you can log out of the SQL Worksheet, stop the ORDS process with CTL + C (on Mac, at least), stop the Podman container, and shut down the Podman virtual machine.
Use Control + C to stop the ORDS process.I am stopping the entdb213 container.Exiting from the Podman virtual machine.
And since we set this all up with a volume (so….so long ago, I know; we called it entdb213vol), you can start the container later on, and all your work will still be there (i.e., It shall persist!).
The end
Congrats, we made it! What do you think? Did I miss anything? If so, comment, and I’ll respond and update this post as needed. And if you think this could be useful for others, do share!
The plan was to create an ORACLE REST endpoint and then POST a CSV file to that auto-REST enabled table (you can see how I did that here, in section two of my most recent article). But, instead of doing this manually, I wanted to automate this POST request using Apple’s Automator application…
Me…two paragraphs from now
Follow along with the video
The plan
I did it. I went so far down the rabbit hole, I almost didn’t make it back alive. I don’t know when this ridiculous idea popped into my head, but it’s been well over a year. Until now, I either hadn’t had the time or the confidence to really tackle it.
The plan was to create an ORACLE REST endpoint and then POST a CSV file to that auto-REST enabled table (you can see how I did that here, in section two of my most recent article). But, instead of doing this manually, I wanted to automate this POST request using Apple’s Automator application.
The use case I made up was one where a person would need to periodically feed data into a table. The data doesn’t change, nor does the target table. Here is an example of the table I’m using:
The basic structure of the Bank Transfers table
And the DDL, should there be any interest:
CREATE TABLE ADMIN.BANK_TRANSFERS
(TXN_ID NUMBER ,
SRC_ACCT_ID NUMBER ,
DST_ACCT_ID NUMBER ,
DESCRIPTION VARCHAR2 (4000) ,
AMOUNT NUMBER
)
TABLESPACE DATA
LOGGING
;
Once this table was created, I auto-REST enabled the table and retrieved the complete cURL Command for performing a Batch Loadrequest. Remember, we have three examples for cURL Commands now, I chose Bash since I’m on a Mac:
Retrieving the the Batch Load cURL Command
Once I grabbed the cURL Command, I would temporarily save it to a clipboard (e.g. VS Code, TextEdit, etc.). I’d then create a new folder on my desktop.
The newly created ords_curl_post folder
How I actually did it
I’d then search via Spotlight for the Automator application. Once there, I’d choose Folder Action.
Choosing Folder Action for this automation
HEY!! README: I'm going to breeze through this. And it may seem like I am well-aquainted with this application. I am not.I spent hours debugging, reading through old StackExchange forums, and Apple documentation so I could share this with you. There is a ton more work to do. But bottom line, this thing works, and its something that is FREE and accessible for a lot of people. You could have a TON of fun with this stuff, so keep reading!
There’s no easy way to get around this, but to get really good at this, you’ll just need to tinker. Luckily, most of these automation modules are very intuitive. And there is a ton of information online on how to piece them all together.
Automator ๐ค
All of these modules are drag-and-drop, so it makes it easy to create an execution path for your Folder Action application. Eventually, I ended up with this (don’t worry, I’ll break it down some, a video is in the works for a more detailed overview):
Complete Folder Action automation for the ORDS Batch Load request
The modules
The modules I’m using are:
Get Specified Finder Items
Get Folder Contents
Run Shell Script (for a zsh shell, the default for this MacBook)
Set Value of Variable
Get Value of Variable
Display Notification
You can see at the very top, that I have to choose a target folder since this is a folder action. I chose the folder I created; ords_curl_post.
Get Specified Finder Items and Get Folder Contents
The first two modules are pretty straightforward. You get the specified finder items (from that specific folder). And then get the contents from that folder (whatever CSV file I drop in there). That will act as a trigger for running the shell script (where the filename/s serve as the input for the cURL Command).
PAUSE: I must confess, I had essentially ZERO experience in shell scripting prior to this, and I got it to work. Its probably not the prettiest, but damn if I'm not stoked that this thing actually does what it is supposed to do.
The only main considerations on this shell script are that you’ll want to stay with zsh and you’ll want to choose “as arguments” in the “Pass input” dropdown menu. Choosing “as arguments” allows you to take that file name and apply it to the For Loop in the shell script. I removed the echo "$f" because all it was doing was printing out the file name (which makes sense since it was the variable in this script).
Choosing “as arguments“
The Shell Script
That cURL Command I copied from earlier looks like this:
I made some modifications though. I made sure Content-Type was text/csv. And then I added some fancy options for additional information (more details on this here, go nuts) when I get a response from the database.
REMINDER: I didn't know how to do this until about 30 mins before I got it to work. I'm emphasizing this because I want to drive home the point that with time and some trial-and-error, you too can get something like this to work!
With my changes, the new cURL Command looks like this:
What a mess…That -w option stands for write-out. When I receive the response from the Batch Load request, I’ll want the following information:
Response Code (e.g. like a 200 or 400)
Total Upload Time
Upload Speed
Upload Size
All of that is completely optional. I just thought it would be neat to show it. Although, as you’ll see in a little bit, Apple notifications has some weird behavior at times so you don’t really get to see all of the output.
I then applied the cURL command to the shell script, (with some slight modifications to the For Loop), and it ended up looking like this:
New shells script with updated cURL command
Here is what the output looked like when I did a test run (with a sample CSV):
Success on the cURL command
Set Value of Variable
All of that output, referred to as “Results”, will then be set as a variable. That variable will be henceforth known as the responseOutput (Fun fact: that is called Camel casing…I learned that like 3-4 months ago). You’ll first need to create the variable, and once you run the folder action, it’ll apply the results to that variable. Like this:
Creating a new variableResults from cURL command applied to variable
Get Value of Variable and Display Notification
Those next two modules simply “GET” that value of the variable/results and then sends that value to the Display Notification module. This section is unremarkable, moving on.
And at this point, I was done. All I needed to do was save the script and then move on to the next step.
Folder Actions Setup
None of this will really work as intended until you perform one final step. I’ll right-click the target folder and select “Folder Actions Setup.” From there a dialog will appear; you’ll want to make sure both the folder and the script are checked.
Selecting Folder Actions SetupDouble checking that everything is enabled
Trying it out
Next, I emptied the folder. Then I dropped in a 5000-row CSV file and let Folder Actions do its thing. This entire process is quick! I’m loving the notification, but the “Show” button simply does not work (I think that is a macOS quirk though). However, when I go back to my Autonomous Database, I can 100% confirm that this ORDS Batch Load worked.
Successful Batch LoadDouble checking the Autonomous Database
Final thoughts
This was relatively easy to do. In total, it took me about 3-4 days of research and trial and error to get this working. There is a lot I do not know about shell scripting. But even with a rudimentary understanding, you too can get this to work.
Next, I’d like to create a dialog window for the notification (the output from the cURL Command). I believe you can do that in AppleScript; I just don’t know how yet.
If you are reading this and can think of anything, please leave a message! If you want to try it out for yourself, I’ve shared the entire workbook on my GitHub repo; which can be found here.
I’ll also be doing an extended video review of this, where I’ll recreate the entire automation from start to finish. Be on the lookout for that too!
Overview and connecting with the python-oracledb library
Part II
Connecting with Oracle REST APIs unauthenticated
Part III
Custom Oracle REST APIs with OAuth2.0 Authorization
Welcome back
I finally had a break in my PM duties to share a small afternoon project [I started a few weeks ago]. I challenged myself to a brief Python coding exercise. I wanted to develop some code that allowed me to connect to my Autonomous Database using either our python-oracledb driver (library) or with Oracle REST Data Services (ORDS).
I undertook this effort as I also wanted to make some comparisons and maybe draw some conclusions from these different approaches.
NOTE: If you don't feel like reading this drivel, you can jump straight to the repository where this code lives. It's all nicely commented and has everything you need to get it to work. You can check that out here.
The test files
Reviewing the code, I’ve created three Python test files. test1.py relies on the python-oracledb library to connect to an Oracle Autonomous database while test2.py and test3.py rely on ORDS (test3.py uses OAuth2.0, but more on that later).
test1.py using the python-oracledb librarytest2.py relies on an unsecured ORDS endpointtest3.py with ORDS, secured with OAuth2
Configuration
Configuration directory
I set up this configuration directory (config_dir) to abstract sensitive information from the test files. My ewallet.pem and tnsnames.ora files live in this config_dir. These are both required for Mutual TLS (mTLS) connection to an Oracle Autonomous database (you can find additional details on mTLS in the docs here).
ewallet.pem and tnsnames.ora files
Other files
OAuth2.0, Test URLs, and Wallet Credential files
Other files include oauth2creds.py, testurls.py, and walletcredentials.py. Depending on the test case, I’ll use some or all of these files (you’ll see that shortly).
NOTE: If not obvious to you, I wouldn't put any sensitive information into a public git repository.
Connecting with python-oracledb
One approach to connecting via your Oracle database is with the python-oracledb driver (library). An Oracle team created this library (people much more experienced and wiser than me), and it makes connecting with Python possible.
FYI: I’m connecting to my Autonomous Database. If you want to try this, refer to the documentation for using this library and the Autonomous database. You can find that here.
The Python code that I came up with to make this work:
#Connecting to an Oracle Autonomous Database using the Python-OracleDB driver.
import oracledb
# A separate python file I created and later import here. It contains my credentials, so as not to show them in this script here.
from walletcredentials import uname, pwd, cdir, wltloc, wltpwd, dsn
# Requires a config directory with ewallet.pem and tnsnames.ora files.
with oracledb.connect(user=uname, password=pwd, dsn=dsn, config_dir=cdir, wallet_location=wltloc, wallet_password=wltpwd) as connection:
with connection.cursor() as cursor:
# SQL statements should not contain a trailing semicolon (โ;โ) or forward slash (โ/โ).
sql = """select * from BUSCONFIND where location='ZAF'
order by value ASC """
for r in cursor.execute(sql):
print(r)
In Line 7, you can see how I import the wallet credentials from the walletcredentials.py file. Without that information, this code wouldn’t work. I also import the database username, password, and configuration directory (which includes the ewallet.pem and tnsnames.ora files).
From there, the code is pretty straightforward. However, some library-specific syntax is required (the complete details are in the docs, found here), but aside from that, nothing is too complicated. You’ll see the SQL statement in Lines 16-17; the proper SQL format looks like this:
SELECT * FROM busconfind WHERE location='zaf'
ORDER BY value ASC;
And here is an example of this SQL output in a SQL Worksheet (in Database Actions):
Reviewing the SQL in Database Actions
FYI: This is a Business Confidence Index data-set, in case you were curious (retrieved here).
That SQL allows me to filter on a Location and then return those results in ascending orderaccording to the Value column. When I do this using the python-oracledb driver, I should expect to see the same results.
NOTE: You've probably noticed that the SQL in the python file differs from that seen in the SQL Worksheet. That is because you need to escape the single quotes surrounding ZAF, as well as remove the trailing semi-colon in the SQL statement. Its all in the python-oracledb documentation, you just have to be aware of this.
Once I have all the necessary information in my walletcredentials.py file, I can import that into the test1.py file and execute the code. I chose to run this in an Interactive Window (I’m using VS Code), but you can also do this in your Terminal. In the images (from left to right), you’ll see the test1.py file, then a summary of the output from that SQL query (contained in the test1.py code), and finally, the detailed output (in a text editor).
Executing the Python code in an Interactive WindowSummary output from test1.pyDetailed output from test1.py
Wrap-up
For those that have an existing Free Tier tenancy, this could be a good option for you. Of course, you have to do some light administration. But if you have gone through the steps to create an Autonomous database in your cloud tenancy, you probably know where to look for the tnsnames.ora and other database wallet files.
I’m not a developer, but I think it would be nice to simplify the business logic found in this Python code. Maybe better to abstract it completely. For prototyping an application (perhaps one that isn’t micro services-oriented, this could work) or for data- and business analysts, this could do the trick for you. In fact, the data is returned to you in rows of tuples; so turning this into a CSV or reading it into a data analysis library (such as pandas) should be fairly easy!
Connecting via ORDS: sans OAuth2.0
Auto-REST and cURL
I’m still using the “devuser” (although this may be unnecessary, as any unsecured REST-enabled table would do). I’m using the same table as before; the only change I’ve made is to auto-REST enable the BUSCONFIND table for the test2.py code.
In the following images, I’m retrieving the cURL command for performing a GET request on this table.
NOTE: In a recent ORDS update, we made available different shell variations (this will depend on your OS); I've selected Bash.
From there, I take the URI (learn more on URIs) portion of the cURL command and place it into my browser. Since this table is auto-REST enabled, I’ll only receive 25 rows from this table.
NOTE: The ORDS default pagination is limit = 25.
Getting the cURL command from an already ORDS REST-enabled tableSelecting the GET request for BashGET response in JSONThe raw JSON, pretty printed
The code
And the code for this test2.py looks like this:
# Auto-REST enabled with ORDS; in an Oracle Autonomous Database with query parameters.
import requests
import pprint
# Importing the base URI from this python file.
from testurls import test2_url
# An unprotected endpoint that has been "switched on" with the ORDS Auto-REST enable feature.
# Query parameters can be added/passed to the Base URI for GET-ing more discrete information.
url = (test2_url + '?q={"location":"ZAF","value":{"$gt":100},"$orderby":{"value":"asc"}}}')
# For prototyping an application, in its earlier stages, this could really work. On your front end, you
# expect the user to make certain selections, and you'll still pass those as parameters.
# But here, you do this as a query string. In later stages, you may want to streamline your application
# code by placing all this into a PL/SQL or SQL statement. Thereby separating application
# logic and business logic. You'll see this approach in the test3.py file.
# This works, but you can see how it gets verbose, quick. Its a great jumping-off point.
responsefromadb = requests.get(url)
pprint.pprint(responsefromadb.json())
Lines 8 and 13 are the two areas to focus on in this example. In Line 8 imported my URL from the testurls.py file (again, abstracting it, so it’s not in the main body of the code).
The test2.py and testurls.py files
And then, in Line 13, I appended a query string to the end of that URL. ORDS expects the query parameters to be a JSON object with the following syntax:
[ORDS Endpoint]/?q={"JSON Key": "JSON Value"}
The new, complete query string below requests the same information as was requested in the test1.py example:
This string begins with that same BASE URI for the ORDS endpoint (the auto-REST enabled BUSCONFIND table) and then applies the query string prefix “?q=” followed by the following parameters:
Filter by the location "ZAF"
Limit the search of these locations to values (in the Value column) greater than ($gt) 100
Return these results in ascending order (asc) of the Value column
NOTE: You can manipulate the offsets and limits in the python-oracledb driver too. More info found here. And filtering in queries with ORDS can be found here.
And if I run the test2.py code in the VS Code Interactive Window, I’ll see the following summary output.
Summary output from the response in test2.py
Here is a more detailed view in the VS Code text editor:
Detailed output with helpful links
Wrap-up
A slightly different approach, right? The data is all there, similar to what you saw in the test1.py example. There are a few things to note, though:
The consumer of this ORDS REST API doesn’t need access to the database (i.e. you don’t need to be an admin or have a schema); you can perform GET requests on this URI.
The response body is in JSON (ubiquitous across the web and web applications)
Also, language and framework agnostic (the JSON can be consumed/used widely, and not just with Python)
You are provided a URI for each item (i.e. entry, row, etc.)
No need for SQL; just filter with the JSON query parameters
No business logic in the application code
Needless to say, no ORMs or database modeling is required for this approach
However…security is, ahem…nonexistent. That is a problem and flies in the face of what we recommend in our ORDS Best Practices.
Connecting via ORDS: secured with OAuth2
Note: This is an abbreviated explanation, I'll be posting an expanded write-up on this example post haste!
Since this is what I’m considering “advanced” (it’s not difficult, there are just many pieces) I’m going to keep this section brief. Long story short, I’ll take those query parameters from above and place them into what is referred to as a Resource Handler.
TIME-OUT: Auto-REST enabling a database object (the BUSCONFIND table in this case) is simple in Database Actions. Its a simple left-click > REST-enable. You saw that in the previous example. You are provided an endpoint and you can use the query parameters (i.e. the JSON {key: value} pairs) to access whatever you need from that object.
However, creating a custom ORDS REST endpoint is a little different. First you create a Resource Module, next a (or many) Resource Template/s, and then a (or many) Resource Handler/s. In that Resource Handler, you'll find the related business logic code for that particular HTTP operation (the menu includes: GET, POST, PUT, and DELETE).
The Resource Module
The process of creating a custom ORDS API might be difficult to visualize, so I’ll include the steps I took along with a sample query (in that Resource Handler) to help illustrate.
Creating the Resource Module in the ORDS REST WorkshopCreating the Resource TemplateReviewing the available operations for the Resource TemplateThe newly created Resource GET HandlerPlacing the SQL directly into the Resource HandlerTesting out the code to simulate a GET request using "ZAF" as the locationReviewing the output of that SQL query, in a table format
Chances are you may be the administrator of your Always Free tenancy, so you have full control over this. Other times, you might be provided the REST endpoint. In that case, you may not ever have to worry about these steps. Either way, you can see how we’re simulating (as well as both abstracting and keeping the business logic in the database) the query with this final example (test3.py).
Security
The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service, either on behalf of a resource owner by orchestrating an approval interaction between the resource owner and the HTTP service, or by allowing the third-party application to obtain access on its own behalf.
RFC 6749: The OAuth 2.0 Authorization Framework
I’ll keep this section brief, but I’m protecting this resource through the aid of an ORDS OAuth2.0 client. I’ve created one here:
After creating a client you can use the provided URL for requesting a new Bearer Token
And, as you’ll see shortly, I’ll rely on some Python libraries for requesting an Authorization Token to use with the related Client ID and Client Secret. If you want to nerd out on the OAuth2.0 framework, I challenge you to read this.
test3.py example
NOTE: Remember, I'm keeping this section intentionally brief. It deserves a slightly deeper dive, and class is almost over (so I'm running out of time).
The code for this example:
# Custom ORDS Module in an Oracle Autonomous Database.
import requests
from requests_oauthlib import OAuth2Session
from oauthlib.oauth2 import BackendApplicationClient
import pprint
import json
# Importing the base URI from this python file.
from testurls import test3_url
# A separate python file I created and later import here. It contains my credentials,
# so as not to show them in this script here.
from oauth2creds import token_url, client_id, client_secret
token_url = token_url
client_id = client_id
client_secret = client_secret
client = BackendApplicationClient(client_id=client_id)
oauth = OAuth2Session(client=client)
token = oauth.fetch_token(token_url, client_id=client_id, client_secret=client_secret)
bearer_token = token['access_token']
# Location can be anything from the table. Now, only the single variable needs to be passed. Business logic has been abstracted somewhat; as it now resides within
# ORDS. This could make your application more portable (to other languages and frameworks, since there are fewer idiosyncracies and dependencies):
location = "ZAF"
# print(location)
# ------------------------------------------------------------------------------ #
# In Database Actions, we:
# 1. Create an API Module
# 2. Then create a Resource Template
# 3. Finally, a GET Resource Handler that consists of the code from test1.py:
# select * from BUSCONFIND where location= :id
# order by value ASC
# ------------------------------------------------------------------------------ #
url = (test3_url + location)
# print(url)
responsefromadb = requests.get(url, headers={'Authorization': 'Bearer ' + bearer_token}).json()
# This step isn't necessary; it simply prints out the JSON response object in a more readable format.
pprint.pprint(responsefromadb)
Lines 11 and 16 deserve some attention here. The URL for Line 11 comes from the testurls.py file; seen in the previous example. And the contents from Line 16 come from the oauth2creds.py file. Here are the files, side-by-side:
The test3.py, testurls.py, and oauth2creds.py files
As you can see in the testurls.py file, I’m relying on the test3_url for this example. And the OAuth2.0 information you see comes directly from the OAuth Client I created in Database Actions:
In this image, you can see the Client ID and Client Secret
If I put that all together, I can execute the code in test3.py and “pretty print” the response in my Interactive Window. But first I need to adjust the Resource Handler’s URI (the one I copied and pasted from the “REST Workshop”). It retains the “:id” bind parameter. But the way I have this Python code set up, I need to remove it. It ends up going from this:
With that out of the way, I can run this code and review the output.
Running the test3.py code in the Interactive WindowReviewing the summary output – a JSON arrayReviewing the detailed view of the “items“Scrolling to the bottom of the GET response body to see the available links for additional items
From top-to-bottom, left-to-right you’ll see I first execute the code in the Interactive Window. From there I can review a summary of the response to my GET request. That pretty print library allows us to see the JSON array in a more readable format (one that has indentation and nesting); which you can see in the second image. The third image is a more detailed view of the first half of this response. And I include the final image to highlight the helpful URLs that are included in the response body.
Since I know my limit = 25, and the 'hasMore': True (seen in the output in that third image) exists, I know there are more items. You can adjust the limit and offset in subsequent requests, but I’ll save that for another day.
Wrap-up
You can probably tell, but this is like an expansion of the previous example. But instead of relying on the auto-REST enabling, you are in full control of the Resource Module. And while you don’t need to use OAuth2.0 it’s good practice to use it for database authentication. You can see how the response comes through a little differently, compared to the previous example, but still very similar.
In this example, I did all the work, but that might not be the case for you; much of it might be handled for you. The main thing I like about this example is that we rely on stable and popular Python libraries: requests, requests_oauthlib, and oautlib.
The fact that this is delivered as a JSON object is helpful as well (for the same reasons mentioned in the second example). And finally, I enjoy the fact that you only need to pass a single parameter from your (assumed) presentation layer to your application layer; an example might be a selection from an HTML form or drop-down menu item.
The end
We’re at the end of this fun little exercise. As I mentioned before, I will expand on this third example. There are so many steps, and I think it would be helpful for people to see a more detailed walk-through.
And be on the lookout (BOLO) for a video. There’s no way around this, but a video needs to accompany this post.
And finally, you can find all the code I review in this post in my new “blogs” repository on GitHub. I encourage you to clone, fork, spoon, ladle, knife, etc…
That’s right; I’m back again for yet another installment of this ongoing series dedicated to working with Medium.com story stats. I first introduced this topic in a previous post. Maybe you saw it. If not, you can find it here.
Recap
My end goal was to gather all story stats from my Medium account and place them into my Autonomous Database. I wanted to practice my SQL and see if I could derive insights from the data. Unfortunately, gathering said data is complicated.
Pulling the data down was a breeze once I figured out where to look for these story statistics. I had to decipher what I was looking at in the Medium REST API (I suppose that was somewhat tricky). My search was mostly an exercise in patience (there was a lot of trial and error).
I uploaded a quick video in the previous post. But I’ll embed it here so you can see the process for how I found the specific JSON payload.
Obtaining the raw JSON
Once I found that URL, I saved this JSON as a .json file. The images below show remnants of a JavaScript function captured with the rest of the JSON. I’m no JavaScript expert, so I can’t tell what this function does. But before I load this into my Autonomous Database (I’m using an OCI Free Tier account, you can check it out here if you are curious), it needs to go.
JSON response errorMuch nicer JSON presentation
README
I am pointing out a few things that may seem convoluted and unnecessary here. Please take the time to read this section so you can better understand my madness.
FIRST: Yes, you can manually remove the [presumably] JavaScript saved along with the primary JSON payload (see above paragraphs). I'm showing how to do this in Python as a practical exercise. But I'm also leaving open the opportunity for future automation (as it pertains to cleaning data).
SECOND: When it comes to the Pandas data frame steps, of course, you could do all this in Excel, Numbers, or Sheets! Again, the idea here is to show you how I can clean and process this in Python. Sometimes doing things like this in Excel, Numbers, and Sheets is impossible (thinking about enterprise security here).
THIRD: Admittedly, the date-time conversion is hilarious and convoluted. Of course, I could do this in a spreadsheet application. That's not the point. I was showing the function practically and setting myself up for potential future automation.
FOURTH: I'll be the first to admit that the JSON > TXT > JSON > CSV file conversion is comical. So if you have any suggestions, leave a comment here or on my GitHub repository (I'll link below), and I'll attribute you!
The code
Explaining the code in context, with embedded comments, will be most illuminating.
I’ve named everything in the code as literally as possible. In production, this feels like it might be impractical; however, there is no question about what the hell the code is doing! Being more literal is ideal for debugging and code maintenance.
Here is the entire code block (so CTRL+C/CTRL+V to your heart’s content ๐). I’ll still break this down into discrete sections and review them.
import csv
import json
import pandas as pd
import datetime
from pathlib import Path
# You'll first need to sign in to your account, then you can access this URL without issues:
# https://medium.com/@chrishoina/stats/total/1548525600000/1668776608433
# NOTES:
# Replace the "@chrishoina" with your username
# The two numbers you see are Unix Epochs; you can modify those as # needed; in my case, I
# wanted to see the following:
# * 1548525600000 - At the time of this post, this seems to be
# whenever your first post was published or when
# you first created a Medium account. In this case, for me, this
# was Sat, Jan/26/2019, 6:00:00PM - GMT
# * 1665670606216 - You shouldn't need to change this since it will # just default to the current date.
# For the conversion, I an Epoch Converter tool I found online: https://www.epochconverter.com/
# Step 1 - Convert this to a,(.txt) file
p = Path("/Users/choina/Documents/socialstats/1668776608433.json")
p.rename(p.with_suffix('.txt'))
# Step 2 - "read" in that text file, and remove those pesky
# characters/artifacts from position 0 through position 15.
# I'm only retaining the JSON payload from position 16 onward.
with open("/Users/choina/Documents/socialstats/1668776608433.txt", "r") as f:
stats_in_text_file_format = f.read()
# This [16:] essentially means grabbing everything in this range. Since
# there is nothing after the colon; it will just default to the end (which is
# what I want in this case).
cleansed_stats_from_txt_file = stats_in_text_file_format[16:]
print(cleansed_stats_from_txt_file)
# This took me a day to figure out, but this text file needs to be encoded
# properly, so I can save it as a JSON file (which is about to happen). I
# always need to remember this, but I know that the json.dumps = dump
# string, which json.dump = dump object. There is a difference, I'm not
# the expert, but the docs were helpful.
json.dumps(cleansed_stats_from_txt_file)
# Step 3 - Here, I create a new file, then indicate we will "w"rite to it. I take the
# progress from Step 2 and apply it here.
with open('medium_stats_ready_for_pandas.json', 'w') as f:
f.write(cleansed_stats_from_txt_file)
# Step 4 - Onto Pandas! We've already imported the pandas library as "pd."
# We first create a data frame and name the columns. I kept the names
# very similar to avoid confusion. I feared that timestampMs might be a
# reserved word in Oracle DB or too close, so I renamed it.
df = pd.DataFrame(columns=['USERID', 'FLAGGEDSPAM', 'STATSDATE', 'UPVOTES', 'READS', 'VIEWS', 'CLAPS', 'SUBSCRIBERS'])
with open("/Users/choina/Documents/socialstats/medium_stats_ready_for_pandas.json", "r") as f:
data = json.load(f)
data = data['payload']['value']
print(data)
for i in range(0, len(data)):
df.loc[i] = [data[i]['userId'], data[i]['flaggedSpam'], data[i]['timestampMs'], data[i]['upvotes'], data[i]['reads'], data[i]['views'], data[i]['claps'], data[i]['updateNotificationSubscribers']]
df['STATSDATE'] = pd.to_datetime(df['STATSDATE'], unit="ms")
print(df.columns)
# Step 5 - use the Pandas' df.to_csv function and save the data frame as
# a CSV file
with open("medium_stats_ready_for_database_update.csv", "w") as f:
df.to_csv(f, index=False, header=True)
I used several Python libraries I use for this script:
p = Path("/Users/choina/Documents/socialstats/1668776608433.json")
p.rename(p.with_suffix('.txt')
Pathlib allows you to assign the file’s path to “p”. From there, I changed the .json file extension to a .txt extension.
Note: Again, I'm sure there is a better way to do this, so if you're reading, leave a comment here or on my GitHub repository so I can attribute it to you ๐.
The before and after of what this step looks like this:
JSONbeforeTXTafter
With that out of the way, I needed to remove that JavaScript “prefix” in the file. I do this in Step 2 (I got so fancy that I probably reached diminishing returns). My approach works, and I can repurpose this for other applications too!
Step 2:
# Step 2 - "read" in that text file, and remove those pesky
# characters/artifacts from position 0 through position 15. Or in other
# words, you'll retain everything from position 16 onward because that's
# where the actual JSON payload is.
with open("/Users/choina/Documents/socialstats/1668776608433.txt", "r") as f:
stats_in_text_file_format = f.read()
# This [16:] essentially means grabbing everything in this range. Since
# there is nothing after the colon; it will just default to the end (which is
# what I want in this case).
cleansed_stats_from_txt_file = stats_in_text_file_format[16:]
print(cleansed_stats_from_txt_file)
# This took me a day to figure out, but this text file needs to be
# appropriately encoded to save as a JSON file (which is about to
# happen). I always forget the difference between "dump" and "dumps";
# json.dumps = dump string, whereas json.dump = dump object. There is
# a difference, I'm not the expert, but the docs were helpful (you should
# read them).
json.dumps(cleansed_stats_from_txt_file)
I needed to remove these remnants from the Medium JSON response
While this initially came through as a JSON payload, those first 0-15 characters had to go.
FULL DISCLAIMER: I couldn't figure out how to get rid of this while it was still a JSON file hence why I converted this to a text file (this was the only way I could figure it out).
I captured position 16 to infinity (or the end of the file, whichever occurs first), then I re-encoded the file as JSON (I interpreted this as “something the target machine can read and understand as JSON“).
OPEN SEASON: CompSci folks, please roast me in the comments if I'm wrong.
Step 3
# Step 3 - I create a new file, then I'll "w"rite to it. I took the result from Step 2 and applied it here.
with open('medium_stats_ready_for_pandas.json', 'w') as f:
f.write(cleansed_stats_from_txt_file)
I’m still at the data-wrangling portion of this journey, but I’m getting close to the end. I’ll create a new JSON file, take the parts of the (freshly encoded) text file I need, and then save them as that new JSON file.
Step 4
# Step 4 - Onto Pandas! We've already imported the pandas library as "pd"
# I first create a data frame and name the columns. I kept the names
# similar to avoid confusion. I feared that timestampMs might be a
# reserved word in Oracle DB or too close, so I renamed it.
df = pd.DataFrame(columns=['USERID', 'FLAGGEDSPAM', 'STATSDATE', 'UPVOTES', 'READS', 'VIEWS', 'CLAPS', 'SUBSCRIBERS'])
with open("/Users/choina/Documents/socialstats/medium_stats_ready_for_pandas.json", "r") as f:
data = json.load(f)
data = data['payload']['value']
print(data)
for i in range(0, len(data)):
df.loc[i] = [data[i]['userId'], data[i]['flaggedSpam'], data[i]['timestampMs'], data[i]['upvotes'],
data[i]['reads'], data[i]['views'], data[i]['claps'], data[i]['updateNotificationSubscribers']]
df['STATSDATE'] = pd.to_datetime(df['STATSDATE'], unit="ms")
print(df.columns)
I won’t teach Pandas (and honestly, you do NOT want me to be the one to teach you Pandas), but I’ll do my best to explain my process. I first created the structure of my data frame (“df” in this case). And then, I named all the column headers (these can be anything, but I kept them very close to the ones found in the original JSON payload).
I then opened the newly-saved JSON file and extracted what I needed.
NOTE: I got stuck here for about a day and a half, so let me explain this part.
The data['payload']['value'] refers to the key and value in this particular {key: value} pair. This approach allowed me to grab all the values of “value“. This image explains what I started with (on the left) and what I ended up with (on the right).
The before and after JSON payload
You’ll notice a {"success": true} key: value pair. With this method, I removed that pair and shed others at the end of the JSON payload.
Removing a great deal of trash
I can’t take credit for organically coming up with this next part; Kidson on YouTube is my savior. I’d watch this video to understand what is happening in this piece of code entirely:
for i in range(0, len(data)):
df.loc[i] = [data[i]['userId'], data[i]['flaggedSpam'], data[i]['timestampMs'], data[i]['upvotes'],
data[i]['reads'], data[i]['views'], data[i]['claps'], data[i]['updateNotificationSubscribers']]
In short, you take the values from the columns in the JSON file (above) and then put them into the column locations named in this piece of code:
For instance, the "userId" values in the JSON file will all go into the 'USERID' column in the Pandas data frame. And the same thing will happen for the other values and associated (Pandas data frame) columns.
Finally, I changed the date (which, if you recall, is still in this Epoch format) with the Datetime library to a more friendly, readable date. Using this code:
with open("medium_stats_ready_for_database_update.csv", "w") as f:
df.to_csv(f, index=False, header=True)
I’m at the home stretch now. I take everything I’ve done in Pandas and save it as a CSV file. I wanted to keep the headers but ditch any indexing. The clean CSV file will look like this:
Cleaned, tidy CSV ready for Data Load via SQL Developer Web
Step 6
Lastly, I logged into SQL Developer Web and clicked the new Data Load button (introduced in Oracle REST Data Services version 22.3) to upload the CSV file into a new table. The Autonomous Database automatically infers column names and data types. I slightly modified the "statsdate" column (honestly, I could have left it alone, but it was easy enough to change).
Before and After
And that’s it! Once uploaded, I can compare what I did previously to what I have achieved most recently. And both ways are correct. For instance, depending on your requirements, you can retain the JSON payload as a CLOB (as seen in the first image) or a more traditional table format (as seen in the second image).
Medium stats as a CLOBMedium stats in a typical table format
Wrap up
If you’ve made it this far, congrats! You should now have two ways to store Medium stats data in a table (that lives in the Oracle Autonomous Database) either as:
a CLOB
an OG table
And if you’d like to review the code, you can find it here.
I feel so silly for posting this because you’ll quickly realize that I will have to leave things unfinished for now. But I was so excited that I got something to work, that I had to share!
If you’ve been following along, you know you can always find me here. But I do try my best to cross-post on other channels as well:
But given that everything I do supports the development community, audience statistics are always crucial to me. Because of this, I’ll periodically review my stats on this site and the others to get a feel for the most popular topics.
I even did a RegEx post a while back that was pretty popular too. Thankfully it wasn’t that popular, as it pained me to work through Regular Expressions.
I can quickly review site statistics on this blog, but other places, like Medium, are more challenging to decipher. Of course, you can download your Audience stats, but sadly not your Story stats ๐.
Audience stats download, but no Story stats download.
Undeterred, I wanted to see if it was somehow possible to acquire my Story stats. And it is possible, in a way…
Show and tell
If after you log into your Medium account, navigate to your stats page, open up the developer tools in your browser and navigate to your “Console.” From there, reload the page and simply observe all the traffic.
You’ll see a bunch of requests:
GET
POST
OPTION (honestly, I’ve no idea what this is, but I also haven’t looked into it yet)
My thought was that the stats content was produced through (or by) one of these API requests. So yes, I (one at a time) expanded every request and reviewed the Response Body of each request. I did that until I found something useful. And after a few minutes, there it was:
The magic GET request.
I confirmed I had struck gold by taking this URL, placing it in a new browser window, and hitting Enter. And after selecting “Raw Data,” I saw this:
Double-checking the raw JSON.
Indeed, we see my Story stats. But the final two paths in the URL made no sense to me.
The paths looked similar; I had no choice but to activate Turing Modeโข.
I could see these numbers were similar, so I lined them up in my text editor and saw that they shared the same 166 prefixes. I don’t know much about machine-readable code, but since what was appearing on my screen was the last 30 days, I thought this might be some sort of date. But I’d never seen anything like this, so I wasn’t 100% sure.
Unix Time Stamps
After about 20 mins of searching and almost giving up, I found something in our Oracle docs (a MySQL reference guide of all places) that referenced Unix Time Stamps. Eureka!
About Unix time stamps in the Oracle MySQL docs.
Success, I’d found it. So I searched for a “Unix time stamp calculator” and plugged in the numbers. My hunch was correct; it was indeed the last thirty days!
Verifying the Unix Time Stamp.
So now I’m wondering if I change that leading date in the GET request will it allow me to grab all my story statistics from January 2022 till now? Oh, hell yeah, it will!
All my Story stats from Jan 2022 to the present.
End of the line
Right, so here is where I have to leave it open-ended. I had a finite amount of time to work on this today, but what I’d like to do is see if I can authenticate with Basic Authentication into my Medium account. And at least get a 200 Response Code. Oh wait, I already did that!?
Getting that sweet, sweet 200 Response Code.
And now the Python code!
import requests
import json
from requests.auth import HTTPBasicAuth
url = "https://medium.com/m/signin"
# I found this to work even if I typically sign on through
# the Google Single-sign-on. I just used the same email/password
# I do when I login directly to google (Gmail).
user = "[Your login/email]"
password = "[Your password]"
r = requests.get(url, auth=HTTPBasicAuth(user, password))
print(r)
# I found this URL in the console but then removed everything after
# the query string (the "?"), and used that for the requests URL
# "/m/signin?operation=login&redirect=https%3A%2F%2Fmedium.com%2F&source=--------------------------lo_home_nav-----------"
You’re probably wondering how I found the correct URL for the Medium login page. Easy, I trolled the Console until I found the correct URL. This one was a little tricky, but I got it to work after some adjusting. I initially found this:
And since I thought everything after that “?” was an optional querystring, I just removed it and added the relevant parts to Medium’s base URL to get this:
If I want to keep it as is, I know I can load the JSON with a cURL command and an ORDS Batch Load API with ease. I dropped this into my Autonomous Database (Data Load) to see what it would look like:
My CLOB.
We do something very similar in the Oracle LiveLabs workshop (I just wrote about it here). You can access the workshop here!
I’ll have a follow-up to this. But for now, this is the direction I am headed. If you are reading this, and want to see more content like this, let me know! Leave a comment, retweet, like, whatever. So that I know I’m not developing carpal tunnel for no reason ๐คฃ.
Recently Jeff and I were invited by the Oracle Developers and Developer Relations teams to do a walkthrough of a LiveLabs workshop, โHow to Build Powerful and Secure REST APIs for Your Oracle Autonomous Database.โ
We spent about 90 minutes moving through selected labs in the workshop. Luckily they recorded it for us; you can watch it in all its glory here.
If that video piques your interest, I encourage you to complete the workshop since it provides an excellent overview of Oracle REST Data Services APIs โ specifically when working in Database Actions (in the Oracle Autonomous Database).
About the workshop
Labs 1, 2, and 7 are common across many workshops. These were our focus.
The workshop consists of seven labs, but labs 3-6 were the main focus.
Two approaches to REST-enabling your Oracle database objects.
We also wanted to highlight the two ways a user could create Oracle REST APIs in Database Actions (formerly SQL Developer Web). You can jump right in with auto-REST enabling or get creative by building your Resource Modules > Templates > Handlers.
Workshop highlights
I wonโt walk through the labs in detail here, but what I will do is highlight areas that:
Were cool/worth revisiting, or
Have (or continue to) helped speed up my productivity in Database Actions (and through association with the Autonomous Database)
The videos are queued up to the related topic.
Lab 3
Lab 3 walks you through connecting to an Autonomous Database with Database Actions. From there, you create a table from a CSV file. And finally, youโll auto-REST enable the table with simple mouse clicks.
Data Loading
I’ve found no less than three GUI-based ways to load data in Database Actions.
Auto-REST enabling
We are using mouse clicks for auto-REST enabling database objects in the Oracle Autonomous Database.
Show Code toggle
The new “Show Code” toggle switch in Database Actions.
This feature isnโt limited to the SQL Worksheet; it's found across Database Actions!
cURL command options for your environment
cURL commands now provide Power Shell, Command Prompt, and Bash examples.
Lab 4
Lab 4 walks you through using a Batch Load API for loading two million+ rows into the table you previously created (in Lab 3). We also make a SQL procedure and later use PL/SQL to simulate a REST API call to the table.
We briefly discussed the Cloud Shell and Code Editor (both in Oracle Cloud Infrastructure). Click the links to learn more, they are free and included in your OCI tenancy ๐.
A crash course on query parameters
Jeff has a helpful article here (one I reference A LOT).
You can review our docs here (we mention it in several areas).
Graduating from auto-REST
A short discussion on when and why you may want to move away from auto-REST-enabled Oracle APIs to more customized Oracle REST APIs.
Lab 5
In Lab 5, you use Database Actions and the REST console to build a REST API using a parameterized PL/SQL procedure and SQL statement. We do this manually in the previous lab but then REST-enable it here (this is a continuation and refinement of the last lab).
This continues to confound me, so if you are in the same boat as me and you want me to do some more dedicated posts on this, let me know!
Lab 6
The goal of this lab was to educate you on Roles, Privileges, and OAuth 2.0 Client Authentication. Unfortunately, we ran out of time and had to speed through this final section. However, I did show off some of the OpenAPI functions within Database Actions.
OpenAPI Specifications
Specifically, we reviewed how you can view your Resource Modules in the OpenAPI view (displayed as a Swagger UI implementation). And view/execute handlers to observe their responses.
We also mentioned how you can export a Resource Module in either PL/SQL code or the OpenAPI JSON code.
I suspect you should be all set to complete this workshop (located here). But why stop the fun there? We have some other LiveLabs workshops that might interest you, too. You should check them out!
The last workshop on the list is our newest one! So if you do attempt it, feel free to create an issue for enhancements (or if anything is unclear and needs updating) on my GitHub repository ๐!
It was bugging me that I couldnโt perform a simple Python POST request to an ORDS REST-enabled table.
This one actually…
I donโt mean to convey that this isnโt possible. Up until very recently, I wasnโt able to do this. Luckily I had a few hours free, so I took to the docs to do some reading. And wouldnโt you know it, like most things in tech, reading the documentation was a practical and valuable use of my time.
Side note
Let me pause here for a minute or two. I should add this disclaimer that none of what I share here uses OAuth 2.0 authentication. Of course, I wish it did, but Iโm just not there yet (technical proficiency). In the future, Iโd like to update this with security in mind1:
The other thing Iโll mention is that Iโm going to include the SQL for creating this table along with all the Python code in my GitHub repo (Iโll also add any code updates to my repo! will also add any updates I make to this code to the repo!).
Also, did you know that saving DDL with Database Actions is just a mouse click away?
Right-click an object in Database Actions > save DDL to a SQL Worksheet or File.DDL for creating this example table.
New User speed run
This section is a bit of an aside, but I also created a new Python Developer user in one of my Autonomous Databases. Itโs straightforward to do as the admin. Here is a โspeed runโ:
POST haste
After creating my new user, I created a โPython_Postโ table. Super imaginative, right? And I kept things simple, naming the four columns (wait for it): "col1", "col2", "col3", and "col4".
Damn…he actually did that.
Cheat codes
I auto-REST enabled my table and reviewed a Bash cURL command so I could remind myself of the expected data (aka payload).
A Bash cURL command for POST requests; can be used as later reference in your editor.
Iโve noticed that if I donโt specify a โrowidโ the Autonomous Database automatically does this. SQL-newbies (like me) might appreciate this since we still donโt know all the best practices for database design (or DDL, for that matter)!
My process might differ from yours, but Iโve used the cURL commands in Database Actions as a cheat. Iโve been copying/pasting this cURL command into my working Python file, so I donโt have to switch screens. Additionally, it helps to remind me what the {โkeyโ:โvalueโ} pairs are (even though I just created all this, I STILL canโt remember what I did).
In this case, Iโm referencing a POST request, but you could do this for the other HTTPS methods too:
GET ALL
GET
POST
BATCH LOAD
PUT
DELETE
Moving on…
I could omit the โrowidโ when making these POST requests. I donโt know if this is typical when working with databases, but this seems to work consistently (at least with the testing I did) with Python and the Requests library.
If you werenโt taken aback by my imaginative table name and column names, then get ready because Iโm about to blow your mind with this next bit. I created a payload for each of these POST request variations:
And I’m just going to be as literal as possible to avoid confusion…is it lame? Yes. Does it work? Undecided, you tell me.
In case you didn’t catch it, they were: payload1, payload2, and payload3.
On feedback…
I also included some feedback for myself. And I feel like this is a good practice because if you successfully POST something, how would you know? Conversely, the same would be true if you unsuccessfully POSTed something. If I were smart, Iโd design a REST API in Database Actions that automatically includes some feedback. But I am not.
If you want to read about this Implicit Parameter and others, click here.
Luckily, the Python Requests library includes various feedback methods. Iโve included the following in my Python file:
raise_for_status() – this will display the error message (if applicable); displays "None" if the request was successful
headers – returns the serverโs response headers as a Python dictionary
Different payloads
My Python code is simple enough. It looks more than it is because I include three POST request variations. Iโve also tested payloads, and it seems like we can send a "payload" as:
Oracle REST Data Services has a ton of documentation on filtering with query parametersโฆ Iโm still learning about it, but they are powerful. Check them out here.
You can even send payloads consisting of nested items/objects, too (e.g., an array or list as one of your values). Iโm sure this violates normalization rules, but itโs possible. In my case, I didnโt do this; I just stuck to my four columns.
Executing the code
After I executed the code, I received the following feedback for these POST requests:
If you squint, you can see three separate responses. I’ve only boxed one, but you should be able to point out the other two.
You should see:
a 201 status code, which indicates “the request has been fulfilled and has resulted in one or more new resources being created2.
None – which if there were a 400 error, that would show me the error message
While in VS Code, the POSTrequests appeared successful. But since I had access, I logged into Database Actions and manually inspected my table. Success!
Reviewing the new updates in Database Actions.
And thatโs itโฆzero to POST request in my Oracle Autonomous Database + ORDS in no time!
Was this helpful?
I have this code in my GitHub repository, so feel free to fork, download, or add comments. But Iโll include here too:
And if you are still hereโโโwas this helpful? Want to see more or something else that is Python + ORDS-related? Let me know here in the comments, on Twitter, or by email!
If you’ve been following along, then you are probably aware of the python/flask/ORDS journey that I’ve embarked on. If not, you can read up on the overview here. The idea was to take local restaurant health inspection scores (aka Sanitary grades) and present the data to a user in a visualization. The web framework I’m using is based on Flask, a popular Python web application framework. And we’ll interact with our Oracle Autonomous Database using our Oracle REST Data Services APIs (for both GET and POST requests).
That progress is moving along, and I’ve been toying with the Folium library for map visualizations.
If you are familiar with Leaflet.js then you basically know Folium – a python wrapper used for displaying map renderings from Leaflet. You can get a taste of what I’m working towards in this latest post.
However, this project is on hold (very, very) temporarily while I become familiar with another Python / Flask / ORDS web application.
eCommerce/Retail Web Application [NEW]
Currently titled “Frank’s Franks”, this application is what I’m referring to as an eCommerce type application. You are presented with product information, you then make a product selection, and create an order. There is no credit card gateway, but the application does actually POST order information to a live Autonomous Database.
This application is the brainchild of our very own Brian Spendolini. But I’m taking a crack at making some additions to it with what I’ve learned from Folium and Bootstrap (HTML and CSS). I’m not sure of the exact final state, but you can follow along as we continue iterating on it (more on this in the GitHub section).
And here is a very crude example of the index (aka “home”) page:
And here is a nice screenshot of the python code along with some in-progress HTML:
Our Flask application is on the left, with the two additional right panes showing the base.html and index.html pages.
I’m showing this because this will be one of the practical labs that our team will be facilitating in October. Why October you ask? Great question…
Oracle CloudWorld
Because that is when Oracle CloudWorld is happening. Queue gratituous copy/paste in 3…2…1…
Join us at CloudWorld, the new global conference where our customers and partners can share ideas, develop in-demand skills, and learn about cloud infrastructure and applications solutions that serve their unique roles and business needs. At CloudWorld, youโll experience first-look product demos by Oracle experts, access hands-on training and certification opportunities, and shape the cloud innovation conversation alongside your peers and industry leaders.
Now I’m not sure who on the team will be attending, but rest assured, one of us will be there.
GitHub
Finally, GitHub. Why am I including an ominous “GitHub” section? For two reasons. First, if you want to follow along with the progress of this CloudWorld lab, I encourage you to peruse my repo, which can be found here.
Second, I’ve started to build out a python-themed library that will document code snippets and frequent code patterns when working with popular libraries + ORDS. The obvious ones are things like “requests” and “pandas”. And a lot of it is things that I have encountered personally. But I’m discovering there is so much more out there for python folks, so hopefully, this will continue to expand.
If you do visit the “library” you’ll see that I have created some ORDS endpoints based on historical U.S. home sale data. You can check it out here.
Current state:
Those endpoints are actually enabled, so you can work with them without having to authenticate. Now of course you could totally mess things up and POST or delete whatever...but lets try to avoid that.
But wait, there’s more
You have both an invitation and a request.
An invitation to follow along with both the lab progress and the python library project.
And a request to
submit ideas and possible areas of improvement (or cool code patterns) for this CloudWorld application,
and submit any requests you might have for the python library that I’ve started to build-out
This is all voluntary of course, but if you think any of this might benefit please do engage and share!
Finally, I’ll be back to regularly-scheduled technical posts soon. In the meantime…
I stumbled upon a new [to me] python library called Folium. It’s a mapping tool that enables python developers (or is it programmers, which is less offensive?) to visualize data on a Leaflet map.
About folium
Folium makes it easy to visualize data thatโs been manipulated in Python on an interactive leaflet map. It enables both the binding of data to a map for choropleth visualizations as well as passing rich vector/raster/HTML visualizations as markers on the map.
The library has a number ofbuilt-in tilesets from OpenStreetMap, Mapbox, and Stamen, and supports custom tilesets with Mapbox or Cloudmade API keys. Folium supports Image/Video, GeoJSON and TopoJSON overlays.
Leaflet.js is the leading open-source JavaScript library for mobile-friendly interactive maps. Out-of-the-box features include: tile layers/WMS, markers/popups, vector layers (polylines, polygons, circles, rectangles), image overlays, and GeoJSON
I highlighted the tilesets above, and I’m not sure if this is a GIS or a leaflet term, but it seems that tilesets refer to different map renderings. Folium includes the following:
OpenStreetMap
Stamen Terrain
Stamen Toner
Stamen Watercolor
At the time of writing, I didn’t have API credentials for Mapbox Bright or Mapbox Control Room. But I’m assuming they still work if you have the appropriate API credentials.
Excluding the Mapbox tiles, those packaged by default are more than sufficient:
Tip: Read more on Stamen here, s'il vous plaรฎt. They seem to be the original creators of these Stamen tiles.
Python + Folium + ORDS = Happiness
Folium installation and quick start pages are straightforward – requiring minimal steps and effort. The same goes for installing the Live Server extension too. Once installed, you can right-click on a .html file to load it onto a local development server (so it displays in a new browser window).
Let me jump right into how I set up Folium to work with ORDS. If you recall, most of the tables I have in my Autonomous Databases are REST-enabled (previous post working with ORDS).
An ORDS Review
1. Once in Database Actions, right-click on an object and select REST > Enable.
2. Choose an alias (good security measure) along with Authentication (also good practice).
3. If you forget the URI, you can review the cURL command.
4. There you can copy/paste directly into a Command-Line or copy the URI portion and place into your browser’s address bar.
…and now back to your regularly scheduled blog post
Since I needed latitude and longitude coordinates for Folium, I decided to work with my Restaurant table. Put the REST endpoint (URI) directly into the browser; and you’ll see this:
Afterward, I looked at the JSON object to see how it was structured (there are too many lists, dictionaries, arrays, etc. to remember amirite?!) because I’d need to iterate through all the items in the GET Response – to tease out what Folium needs.
The play-by-play
Printing the response (to see what it looks like)
I’m covering all my bases here; reviewing the ORDS-provided request/object from a table in my Autonomous Database
Using json.() to “decode” it into a python dictionary
Please pay attention to line 10, where I actually create the base map
Review of what Folium needs
Looping through the python dictionary, pulling out what I need for the next step
Including lines 25-29 to create the pop-up markers.
The ('<i>{}</i>'.format(Resty)) allows me to pass the names to the string (stuff in quotes, plus the {}). The HTML for italics is optional.
In this case, ‘x’ and ‘y’ are the coordinates
Line 31 saves a .html file (which uses Bootstrap!)
Newly rendered .HTML
Using the Live Server extension, I can review the Folium map plus the pop-up markers!
NOTE: This isn't the focus, but spoiler alert, I'll be putting this into a Flask application. I'll also build this out, so the individual markers/popups have charts/graphs for each restaurant. Again, all of which will be provided by ORDS-enabled tables (not so shameless plug).
Summary
After it is all said and done, it took me less than an afternoon to figure out how to make this work. So if you have a table with latitude and longitude, what are you waiting for?! REST-enable it with ORDS, and you can come up with a very quick demo on your own!
Also, I didn’t have to create any database models, the dependencies are minimal (a few python libraries), and an Always Free OCI account is really all you need to get started.
Code
import json
import requests
import folium
m = folium.Map(location=[35.78742648626059, -78.78122033558192], zoom_start=12, tiles="Stamen Watercolor")
response = requests.get('your ORDS enpoint/URI').json()
# print(type(response))
# print(response)
for entry in response['items']:
Resty = entry['name']
Lat = entry['y']
Long = entry['x']
folium.Marker(
location=[Lat, Long],
popup = folium.Popup('<i>{}</i>'.format(Resty))
#this line is optional/an alternative to the above line
# popup = folium.Popup('<i>{}</i>'.format(entry['name']))
).add_to(m)
m.save("index.html")
[I did not] reinvent the wheel
Lastly, much of what I learned about Folium came from the following two videos (I recommend bookmarking them for reference):