Genetic Decipher

I've written fairly extensively on AWS API Gateway (APIGW) and many of the lesser-known features. Today we're going to dive deep on the model validation feature of APIGW and specifically how to handle unexpected input.

The most common use for APIGW is to leverage the "AWS_PROXY" integration that directly connects APIGW to AWS Lambda (this is called "Lambda Function" in the APIGW console). There are several benefits to this approach, including reduced latency, simpler implementation, and more control over the shape of the API response object. Today, I am going to show you how to achieve that same level of control natively within APIGW and use those same techniques to enforce the shape of the request object. We'll wrap up with a discussion of some of the limits of this feature (AWS Tech Evangelists, take note!).

Let's start with a brief introduction to the various parts of APIGW. Each API consists of Resources, Methods, Method Request, Integration Request, Integration Response, and Method Response.

* Resources: These are the paths of your API. Every API starts with a single resource called the "root" resource, denoted by "/".

Adding a "child" named "robot" to this resource would look like "/robot".

The robot resource could have "siblings" such as "/mop" or "/vacuum". Likewise, the robot resource could have children of its own, such as "/robot/biped" or "/robot/tracked".

Here is the finished set of resources.

* Methods: Methods are the HTTP verbs that describe an interaction with a resource. These are [GET, POST, OPTIONS, DELETE, HEAD, PATCH, PUT, ANY]. Each resource can have any, all, or none of these methods attached to it. One thing to note is that a particular verb can only be attached to a resource once. You cannot have two GETs defined for /robot or three PUTs defined for /mop, because when a client makes an HTTP GET request to /robot, there would be no way to know which of the two methods they meant to invoke.

Here we see that the option to add a second GET request is not available. The console is great at preventing this, but if you are using CloudFormation, you could still end up trying to deploy a template with conflicting methods; fortunately, CFN is generally good about telling you which methods conflict on which resource within the API.

I've created a sample mock integration to demonstrate some of the next components.

* Method Request: I like to describe the method request as the client-facing incoming side of the APIGW interaction; this is the place where you (the API developer) will define authorizations needed to interact with the resource/method and define expected query strings, headers, and body.

* Integration Request: This (with the Integration Response we will look at shortly) is the meat of an APIGW API. This maps a client's request to an AWS service request by performing transformations on the request body, modifying headers and query strings, and defining what behavior should be performed when an unexpected input is received.

* Integration Response: This is where the response from the AWS service, on-prem server, or other HTTP proxy integration will be transformed into something that the client wants. Here is where you will define what HTTP status codes you expect in response from the integration endpoint and how you want to change your transformation based on those status codes.

* Method Response: Much like the Method Request, this is the client-facing outgoing side of the APIGW interaction. Here is where you can define specific response "shapes" ("types" if you come from a typed-language background).

Now that we've covered some of the basics, let's dive into some request/response modeling. Suppose you had an API that was used as a simple CRUD interface for your robot collection. You'd like to be able to add robots to your collection, update their price as they become more rare, query for robots with specific attributes, and delete them when you sell or trade them with other roboticists. When we add a new robot to our collection, we want to make sure it has a serial number and generally looks like how we'd expect a robot's description to look (hasLaser: True, isCool: True, mass: >=0kg, etc.). If we were using the AWS_PROXY integration, we would have to write custom Lambda code to parse the request body and validate these fields, co-mingling the request validation logic with our business logic of CRUDing the robots.
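To make the co-mingling concrete, here's a minimal sketch of what that hand-rolled validation looks like inside an AWS_PROXY handler. The handler name and error messages are my own illustration, not code from this post's demo:

```python
import json

def handler(event, context):
    """Hypothetical AWS_PROXY Lambda handler: validation logic is
    co-mingled with the business logic of creating a robot."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

    # Hand-rolled validation -- exactly what an APIGW model could do for us.
    serial = body.get("serialNumber")
    if not isinstance(serial, str) or len(serial) < 10:
        return {"statusCode": 400,
                "body": json.dumps({"error": "serialNumber must be a string of 10+ characters"})}
    mass = body.get("mass")
    if mass is not None and (not isinstance(mass, (int, float)) or mass < 0):
        return {"statusCode": 400,
                "body": json.dumps({"error": "mass must be a non-negative number"})}

    # ...only now do we reach the actual business logic (store the robot)...
    return {"statusCode": 201, "body": json.dumps({"created": serial})}
```

Every one of those `if` blocks is boilerplate that APIGW's model validation can absorb, as we'll see next.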

Here I've created a simple model for a new robot and marked only the serialNumber field as required. If a mass is supplied, it must be non-negative. Let's add it to an API and poke it a few times to see what happens.

```json
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "CreateRobotRequest Schema",
  "type": "object",
  "required": ["serialNumber"],
  "properties": {
    "serialNumber": {
      "type": "string",
      "minLength": 10
    },
    "mass": {
      "type": "number",
      "minimum": 0
    }
  }
}
```
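Since we'll be moving to CloudFormation shortly, here is a sketch of how this model and its validator could be declared there. Resource names like `RobotApi` and `RobotResource` are placeholders, not identifiers from the actual demo stack:

```yaml
CreateRobotRequestModel:
  Type: AWS::ApiGateway::Model
  Properties:
    RestApiId: !Ref RobotApi        # placeholder RestApi resource
    Name: CreateRobotRequest
    ContentType: application/json
    Schema:
      $schema: "http://json-schema.org/draft-04/schema#"
      title: CreateRobotRequest Schema
      type: object
      required: [serialNumber]
      properties:
        serialNumber: {type: string, minLength: 10}
        mass: {type: number, minimum: 0}

BodyValidator:
  Type: AWS::ApiGateway::RequestValidator
  Properties:
    RestApiId: !Ref RobotApi
    ValidateRequestBody: true
    ValidateRequestParameters: false

CreateRobotMethod:
  Type: AWS::ApiGateway::Method
  Properties:
    RestApiId: !Ref RobotApi
    ResourceId: !Ref RobotResource  # placeholder "/robot" resource
    HttpMethod: POST
    AuthorizationType: NONE
    RequestValidatorId: !Ref BodyValidator
    RequestModels:
      application/json: !Ref CreateRobotRequestModel
    Integration:
      Type: MOCK
      RequestTemplates:
        application/json: '{"statusCode": 200}'
```

Note that the model alone does nothing; the RequestValidator with `ValidateRequestBody: true` is what actually turns enforcement on for the method.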

I've created a new mock POST method on my robot Resource and added the request model to the Method Request.

An empty request will fail because it is missing the required serialNumber. We see in the Logs dialog box that the required field is missing.

Likewise, adding a serialNumber that is too short will result in a validation failure.

Let's make a request that we expect to pass validation to make sure it's not just rejecting everything.

Great. Now let's get into the fun stuff. Within the Integration Request, we can specify how we want the API to behave if the content type provided by the client doesn't match the Content-Types we've specified in our mapping templates. It's interesting to note that the default option is not the recommended option.

Before I can go much further, I have to stop using the mock integrations and the APIGW console for the demo. While the mock integration is great for experimenting or returning hard-coded responses that you're certain will never change, it leaves a bit to be desired, as Ben Kehoe described better than I could several years ago, so a fix should be just around the corner. There are also some peculiarities in the way console-initiated test requests differ from regular cURL or Postman requests made to the deployed API (maybe a topic for a later blog post), so we will be working on deployed APIs from now on. For the rest of this post, I'll be using CloudFormation to generate the models and APIs so we can build a bunch of them programmatically to test various configurations. Think Genetic Algorithms for deciphering API Gateway behavior.
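The "build a bunch of them programmatically" idea boils down to enumerating every combination of the knobs we care about and stamping out one API configuration per combination. A minimal sketch of that enumeration (the knob values come from the experiments below; the function name is my own):

```python
from itertools import product

# The three knobs the experiments below vary.
PASSTHROUGH_BEHAVIORS = ["NEVER", "WHEN_NO_MATCH", "WHEN_NO_TEMPLATES"]
CONTENT_TYPES = ["application/json", "application/jsonx", "text/plain"]
PAYLOADS = ["model 1", "model 2", "model 3", "model 4"]

def experiment_matrix():
    """Every (passthrough behavior, request Content-Type, payload) combination;
    each tuple corresponds to one cell in the trial tables below."""
    return list(product(PASSTHROUGH_BEHAVIORS, CONTENT_TYPES, PAYLOADS))

# 3 behaviors x 3 content-types x 4 payloads = 36 requests per integration type
```

In practice each `(behavior, ...)` group becomes its own generated CloudFormation stack, and the content-type/payload pairs become the test requests fired at it.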

The request body passthrough behavior has 3 options that Alex Debrie describes really well here, and I want to experiment with each in a few different scenarios. Below I outline the experiment.

Models:

Model 1 (application/json; when we require an application/json model in an experiment, this is the definition we will use):
{"props1": "String"}

Model 2:
{"props3": "String"}

Model 3 (text/plain; when we require a text/plain model in an experiment, this is the definition we will use):
"String"

Model 4:
{"statusCode": 200}

As far as I can tell, the "content-type" that is required when creating a model in API Gateway is completely unused. I haven't been able to find anywhere that the type is enforced, and I imagine it is just for making the automatically generated SDKs easier to consume.

Next, we will construct 6 experiments.
In these experiments, I use "application/jsonx" to refer to content-types that the API author didn't specify ahead of time. I could have just as easily used "appFOOBARlication/somestring" and the results would be unchanged.

Trial 1:
* no models defined in RequestModels
* no templates defined in integration request
* passthrough behavior: NEVER

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 415 | 415 | 415 | 415 |
| application/jsonx | 415 | 415 | 415 | 415 |
| text/plain | 415 | 415 | 415 | 415 |

This makes sense: we didn't define any models, and we told APIGW to reject any content type we didn't define.

Trial 2:
* no models defined in RequestModels
* no templates defined in integration request
* passthrough behavior: WHEN_NO_MATCH

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 200 | 200 | 200 |
| application/jsonx | 200 | 200 | 200 | 200 |
| text/plain | 200 | 200 | 200 | 200 |

Again, this makes sense. We said "pass the request along if there's no matching content-type," and we didn't identify any content types.

Trial 3:

* no models defined in RequestModels
* no templates defined in integration request
* passthrough behavior: WHEN_NO_TEMPLATES

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 200 | 200 | 200 |
| application/jsonx | 200 | 200 | 200 | 200 |
| text/plain | 200 | 200 | 200 | 200 |

Once again, this makes sense. We said "pass the request through if there are no templates," and all the requests were passed through.

Trial 4:
* content-type application/json and text/plain defined in integration request
* passthrough behavior: NEVER

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 400 | 400 | 400 |
| application/jsonx | 415 | 415 | 415 | 415 |
| text/plain | 400 | 400 | 200 | 400 |

This is where it starts getting fun. The 400 responses mean that validation failed. We told API Gateway to enforce model 1 if the content-type is "application/json", or model 3 if "text/plain" is supplied. We see all of the application/jsonx requests being rejected as the wrong content type (415) and all of the poorly formatted requests of the correct content-type failing validation (400).

Trial 5:
* content-type application/json and text/plain defined in integration request
* passthrough behavior: WHEN_NO_MATCH

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 400 | 400 | 400 |
| application/jsonx | 200 | 200 | 200 | 200 |
| text/plain | 400 | 400 | 200 | 400 |

Here is another interesting example. We see the model enforcement completely bypassed by supplying an unexpected content-type (application/jsonx). Poorly formed requests with application/json or text/plain are still bound to the expected model, but application/jsonx is free to have the whole request passed directly to the backend.

Trial 6:
* content-type application/json and text/plain defined in integration request
* passthrough behavior: WHEN_NO_TEMPLATES

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 400 | 400 | 400 |
| application/jsonx | 415 | 415 | 415 | 415 |
| text/plain | 400 | 400 | 200 | 400 |

This makes sense, given what we've seen so far. We told APIGW to only pass requests on if there were no templates specified. Since we did specify templates for those content-types, the unexpected "application/jsonx" is rejected and the poorly formatted requests are rejected.
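The six trials above can be condensed into a small decision function. This is a sketch derived only from the tables above (non-proxy integrations), not official documentation of APIGW's internals; the function name and parameters are my own:

```python
def non_proxy_outcome(passthrough, content_type, payload_valid,
                      defined_types=("application/json", "text/plain")):
    """Status code observed in Trials 1-6 for a non-proxy integration,
    given the passthrough behavior, the request Content-Type, whether the
    payload validates against that type's model, and which content-types
    were defined in the integration request."""
    if not defined_types:
        # Trials 1-3: nothing defined, so only the passthrough setting matters.
        return 415 if passthrough == "NEVER" else 200
    if content_type not in defined_types:
        # Trials 4-6: an unexpected type (e.g. application/jsonx) only
        # slips through under WHEN_NO_MATCH.
        return 200 if passthrough == "WHEN_NO_MATCH" else 415
    # A defined content-type always has its model enforced.
    return 200 if payload_valid else 400
```

The notable row is the `content_type not in defined_types` branch: WHEN_NO_MATCH is the only behavior that lets an unlisted content-type bypass validation entirely.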

Now let's repeat this with the Lambda "AWS_PROXY" integration and see what nonsense falls out.

Trial 1:
* no models defined in RequestModels
* no templates defined in integration request
* passthrough behavior: NEVER

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 415 | 415 | 415 | 415 |
| application/jsonx | 415 | 415 | 415 | 415 |
| text/plain | 415 | 415 | 415 | 415 |

This makes sense: we didn't define any models, and we told APIGW to reject any content type we didn't define.

Trial 2:
* no models defined in RequestModels
* no templates defined in integration request
* passthrough behavior: WHEN_NO_MATCH

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 200 | 200 | 200 |
| application/jsonx | 200 | 200 | 200 | 200 |
| text/plain | 200 | 200 | 200 | 200 |

Again, this makes sense. We said "pass the request along if there's no matching content-type," and we didn't identify any content types.

Trial 3:

* no models defined in RequestModels
* no templates defined in integration request
* passthrough behavior: WHEN_NO_TEMPLATES

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 200 | 200 | 200 |
| application/jsonx | 200 | 200 | 200 | 200 |
| text/plain | 200 | 200 | 200 | 200 |

Once again, this makes sense. We said "pass the request through if there are no templates," and all the requests were passed through.

Trial 4:
* content-type application/json and text/plain defined in integration request
* passthrough behavior: NEVER

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 400 | 400 | 400 |
| application/jsonx | 200 | 200 | 200 | 200 |
| text/plain | 400 | 400 | 200 | 400 |

This is where the Lambda proxy integration can get dangerous. If you were to use a regular AWS service integration, unexpected content-types would be rejected with a 415 error; if you use the AWS_PROXY integration for Lambda, unexpected content-types are passed straight through with no model validation.

Trial 5:
* content-type application/json and text/plain defined in integration request
* passthrough behavior: WHEN_NO_MATCH

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 400 | 400 | 400 |
| application/jsonx | 200 | 200 | 200 | 200 |
| text/plain | 400 | 400 | 200 | 400 |

No difference between this and the regular AWS service integration.

Trial 6:
* content-type application/json and text/plain defined in integration request
* passthrough behavior: WHEN_NO_TEMPLATES

| Content-Type / Model | model 1 | model 2 | model 3 | model 4 |
| --- | --- | --- | --- | --- |
| application/json | 200 | 400 | 400 | 400 |
| application/jsonx | 200 | 200 | 200 | 200 |
| text/plain | 400 | 400 | 200 | 400 |

Just like Trial 4, you would expect this configuration to reject application/jsonx, but you would be mistaken.

My assumption is that APIGW is doing some kind of "short-cut" behind the scenes to connect APIGW to Lambda, and the Method Request stage is being completely skipped in the path through the system. This creates lower latency (which we saw in the post comparing the APIGW service integration to APIGW->Lambda->DDB with James Beswick). It seems odd that the method request stage of the API is completely skipped, considering the price tag that comes with APIGW and the fact that AWS SAM uses the AWS_PROXY integration by default. I would recommend that if you are using APIGW to integrate with a service that isn't Lambda, you take the time to specify the "NEVER" request passthrough behavior instead of relying on the default "WHEN_NO_MATCH" behavior.
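For reference, here is a sketch of what that recommendation looks like in CloudFormation for a non-Lambda service integration. The resource names, table name, and mapping template are placeholders for illustration, not the stack from these experiments:

```yaml
CreateRobotMethod:
  Type: AWS::ApiGateway::Method
  Properties:
    RestApiId: !Ref RobotApi          # placeholder RestApi resource
    ResourceId: !Ref RobotResource    # placeholder "/robot" resource
    HttpMethod: POST
    AuthorizationType: NONE
    RequestValidatorId: !Ref BodyValidator
    RequestModels:
      application/json: !Ref CreateRobotRequestModel
    Integration:
      Type: AWS                        # direct service integration, not AWS_PROXY
      IntegrationHttpMethod: POST
      Uri: !Sub arn:aws:apigateway:${AWS::Region}:dynamodb:action/PutItem
      Credentials: !GetAtt ApiIntegrationRole.Arn
      # Explicit NEVER: any content-type without a mapping template gets a 415
      # instead of silently passing through to the backend.
      PassthroughBehavior: NEVER
      RequestTemplates:
        application/json: |
          {"TableName": "Robots",
           "Item": {"serialNumber": {"S": "$input.path('$.serialNumber')"}}}
```

With `PassthroughBehavior: NEVER` set explicitly, the "application/jsonx" bypass seen in Trials 5 of both experiment sets can't reach your backend on a non-proxy integration.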