API First Concepts and Problems

Introduction

APIs are displaying continued astronomical growth (see https://www.postman.com/state-of-api/api-global-growth/#api-global-growth) and are now a critical part of nearly all modern architecture. However, we have seen a lot of coverage recently of the different terms encompassed by the API First methodology, especially as people try to use these buzz-words to generate excitement over their products without fully understanding what they mean.

In IT, terms tend to evolve, due to the pace of change. But this is not the ideal way to create terms as it can lead to similar terms, causing sub-optimal communication, even mistakes. But unlike industries like aviation and maritime, which are the gold standard for communication, w do not set a standard and declaration for what precise terms to use and when, to prevent confusion, misunderstanding and planes and boats crashing into each other.

For instance, we now have the following:

(Yes, I know that some of these are unrelated to APIs, however there is a real problem with language here – everything sounds the same and confuses people)

  • “API First”
  • “Code First”
  • “Data First”
  • “Design First”
  • “API First Design”
  • “Business First API Design”
  • “Api First Culture”
  • “Api First Company”
  • “Api Strategy”
  • “API as a Product”
  • … and the list goes on…

I don’t know about you, but my head is rattling from all of these variants. This is getting ridiculous and incredibly confusing for everyone, requiring a lot of reading of articles to understand the nuances of the new terms that are being introducing and how these may subtly differ from the others.

As technologists, we need the business community to fully grasp the ideas and benefits behind the API first methodology so that we and they can make sound judgements on directions and strategies. This means that the ideas need to be immediately comprehensible and their ramifications obvious.

Rubbish pile
Image: https://www.onmanorama.com/lifestyle/news/2022/08/10/eliminate-impact-single-use-plastic-products.html

I am going to suggest that we throw out a lot of these phrases and keep things simple, but I won’t try to reinvent the wheel and create completely new phrases. For the rest of this article, I will only refer to 3 terms.

API First concepts

Each term has an impact on the technical and business strategies/successes in different measures. But each concept requires a full “buy-in” and understanding from the business.

API First

  • Design the and build the API first!
  • The product is built around the API.
  • This is reflected in the business strategies and revenue models (see API as a product and API design 1st).

API-first is obvious from the business point of view. The API is the primary income generator or essential to other income product/s. Without a revenue model, you do not have a business. Therefore if APIs are the main product or critical to a product that is being created/overhauled, then the API becomes a crucial first-class citizen, not just for the technical dept, but also from the business.

Information and data are key to the modern business, right? APIs are the backbone for nearly all data communication.

Building a new product (not just API) is an expensive task, both in time and money. By going through a full design phase for the API, the correct emphasis and attention is given to the API that it needs, a lot of thought will be put into:

  • What data is being shared across the API.
  • Why this data sharing is required or wanted.
  • Who/what will be consuming the data (internal and/or external).
    • This will reveal any potential flaws or opportunities in assumptions that the business has made.
  • Ensuring that all business processes are made more efficient by the new agile data flow that APIs will provide.
  • Performance.
  • Reliability.
  • Scalability
  • Usability.
  • pricing.
  • Future changes/versioning.

n return, API First will have a causal effect on the business by ensuring that due diligence and research is followed before embarking on a costly venture of system build. A lot of thought will be put into what the business needs and why. Future projections and plans become much clearer.

This does not mean that all other resources sit in idle-mode until the API is built, in fact it can happen simultaneously. Once you have the API model built in documentation like OpenAPI or Redoc (quite early in the process), you have a model that can be used by consumers before you start the full development. This saves time and money on fixing broken assumptions, missing or inadequate requirements.

API as a Product

  1. The API may be the core product, but not necessarily.
  2. API as a product has 2 definitions:
    1. External-facing API that is monetised
    2. Indirect monetisation by a product that requires an API to function or public exposure and usage through a free public API.

API as a product, is tightly coupled to the API design-first concept. In order to implement business strategies for revenue driving APIs, good design and documentation is a “must-have” to enable third parties to consume the APIs, providing an easy-to-use, comprehensive and unified experience.

If the API product is well designed and documented, this will increase the consumption and efficiency of the API (and therefore income). Postman alone has seen very significant increases in traffic and API usage, despite the global downturn and Covid:

Since the 2021 State of the API report was released, the Postman API Platform saw significant surges in use:
* Users: 20 million
* Collections created: 38 million
* Requests created: 1.13 billion

https://www.postman.com/state-of-api/api-global-growth/#api-global-growth

Api First Design

In general, this is a development model for APIs that ensures the following criteria are met:

  • Reach agreement from all stakeholders.
  • Design the APIs models first on OpenAPI or similar product.
  • Build the API.
  • Separate products can be built at the same time as the API.

API design first may not be quite so obvious from the business POV, because at first view, it is just a development process. However, because it requires approval from all stakeholders, it is critical to have business “buy-in” to the process. This enables all of the business to see what/why the API is being built, and ensures that everyone is agreed on the most critical features they need.

In addition, this results in a rare instance of a complete understanding of what is being built and why, by the technical department. Instead of just being told to “build this” by the business without any consultation between them on what the rationale is, what is technically feasible, or if there is a better solution, the developers and department see the vision and are able to work towards that vision in a much more efficient manner and the end result will be a much better product.

Conclusion

This all ties in with discoveries and revelations being made about “Developer Experience”.

Up until recently, a lot of research and time has been invested un user experience (UX) and there is no denying the value that this can contribute to the business by increasing traffic and revenue. However a new key realisation is being made about developer experience (DevX). Who will be directly interacting with and consuming the APIs (and therefore making critical decisions that will impact their business about what APIs to consume and the frequency)? That right, developers!

They will correctly discard badly documented or difficult to use APIs in favour of better APIs. This means your product will be directly impacted by the DevX experience!

If you are supplying an external facing API for revenue (direct with monetisation or indirectly by offering APIs free with the associated traffic, publicity and exposure), then DevX is crucial to maximise income.

If you are supplying an internal facing API, then good DevX will have an impact on associated systems design, development and maintenance. Business processes that interact with the API will be either made more efficient by a well designed API or made cumbersome slow and inefficient by bad APIs.

1.0.0-alpha3 Release

After finishing and closing a large chunk of tickets for the tickets that we have planned for the beta release, we had a minor panic…

Api Open Studio Admin has largely been neglected (because it was just supposed to be a quick fix MVP and will be completely replaced before v1.0.0) while we focused on the backbone part of this project: the API (Api Open Studio). However, it was no longer compiling, due to package.json issues and it needed updating to utilise and match changes in the API core resources that it consumed.

This has been quickly resolved and new release tags added so that are now on packagist:

If you are updating an existing instance of Api Open Studio, please make sure that you run the updates:

  • Log-in to the server that contains the API code
  • run ./includes/scripts/update.php

In addition,Api Open Studio Docker Dev has been updated to also implement PHP7.4 or PHP8.0.

Summary of changes in 1.0.0-alpha3:

  • Wholesale changes in the wiki
  • Changed the token auth to JWT tokens.
  • Updated gitlab-ci:
    • Use the new naala89/bookdown-rsync, naala89/phpdoc-rsync, naala89/apiopenstudio-nginx-php-7.4 and naala89/apiopenstudio-nginx-php-8.0 images.
    • Fixed gitlab runner artifacts.
    • Tests run on all merge requests and deploy to wiki/phpdoc on merges.
  • Deprecated Cascade logger and created a wrapper for Monolog.
  • Removed bookdown/bookdown from the composer dev dependencies.
  • Deprecated the Mapper processors.
  • Created new JsonPath and XmlPath processors.
  • Added functional tests for user and role.
  • Created new traits for datatype conversion.
  • Implemented casting on all input vars like VarPost.
  • Create/update CRUD processors now return the value result, rather than true/false.
  • Deprecated the Mapper processors.
  • New JsonPath processor.
  • New XmlPath processor.
  • New Cast processor.
  • Automated tests now for ApiOpenStudio on PHP7.4 & PHP8.0.
  • New code to make globally converting data types in processors easy.
  • You can now specify the expected input data type (automatically cast) in the following processors with a new expected_type attribute:
    • var_post
    • var_get
    • var_uri
    • var_request
    • var_body

Major change to the ApiOpenStudio repository location

In order to implement pipelines and docker, with automated builds of docker images, the ApiOpenStudio projects have all been added to a new ApiOpenStudio group in GitLab.

This will enable GitLab pipelines to orchestrate pipelines across all of the projects as code is pushed and merged.

There was a dependency on this for upcoming tickets and tasks, so the tasks could not be delayed any longer. Because of this change, we have merged the develop branch to master branch, because this will update the wiki and phpdoc to reflect these changes.

However a new release tag for packagist has not been generated at this stage, becase we are only a few tasks away from beta release.

New changes available in the master branch:

  • GitLab CI pipelines now faster, (#118 – closed).
  • Wiki pages updated (#118 – closed & #115 – closed).
  • Fixed CI artefacts not being uploaded on failure (#117 – closed).
  • Logging now works on PHP8.0 as well as PHP7.4 (#111 – closed).
    • This involved deprecating Cascade, and creating a wrapper for the awesome Monolog package.
  • Implemented full JWT token authentication (#101 – closed).
  • Fix automated unit and functional tests (#110 – closed).
  • The entire project code has been updated to ensure all the latest PHPdoc and coding standards are passed.
  • Fixed Packagist for apiopenstudio_admin – sorry, this was my bad – it was a copy and paste error that went unnoticed.

Contributors and developers using the codebase

If you have a clone of the Gitlab repository, you will need to update your remote branch with the following command (assuming you have cloned with SSH):

git remote set-url origin git@gitlab.com:apiopenstudio/apiopenstudio.git

If you have a clone of the GitLab repository, you will need to update your remote branch with the following command (assuming you have cloned with SSH):

git remote set-url origin git@github.com:naala89/apiopenstudio.git

If you have forked the Gitlab repository, you can update the upstream URL:

git remote set-url upstream git@gitlab.com:apiopenstudio/apiopenstudio.git

The updated URLs

The new Group URL’s

The GitLab project URL’s

The GitHub mirror URL’s

Exciting upcoming features for the Beta release

  • Unit and Functional testing for PHP8.0 to ensure working across all contemporary PHP versions.
  • Composer 2.0 should be fine, but this should be tested before Beta release.
  • Swagger processor will be brought up to dat and fixed to allow importing and exporting of OpenApi documents.
  • Automated tagging and generation of an ApiOpenStudio Docker image

A Deep Dive into JWT Tokens in ApiOpenStudio Auth

Introduction

After a ton of work, over the next week or two we will be introducing JWT tokens for authentication in ApiOpenStudio.

This will:

  • Significantly increase the speed of resource requests.
  • Make individual transactions stateless.
  • Maintain the granular access rights to resources, based on the user’s access rights, the resource’s holding account & application, and of course the resource itself.
  • Ensure viability and ease for enterprise clients to use 3rd party authorisation services.

This article will take a look into the JWT specification, current practices and how it is used in ApiOpenStudio.

Rationale

While trying to optimise the authentication DB queries that are performed before the resource was processed, we came to the realisation that the queries were quite long and involved the joining of multiple tables and had several sub-queries so that the query could take into account our extensive range of access rights, like Administrators, Account managers, Applications Managers, Developers and Consumers. Although this would obviously be faster in a production environment, this was causing 1-2 seconds additional processing time to calls in development environments…

Because this had to be calculated every time a resource was called!

The solution was to introduce stateless authorisation, in the form of JWT tokens.

JWT is rapidly becoming the industry standard for authentication and JWT tokens have what are called Claims, which are individual name/value pairs within the body of the token and the token itself is encrypted and secure, which means that sensitive data can be securely included in the token and thus means that the user’s roles and permissions can be included as a claim and only need to be fetched once (during the JWT token GET call).

In addition, JWT tokens have a TTL, which means that we do not need to store a bearer token in the DB against each user (use that to fetch the user that each time a request is received) – because if the token is valid, then there is no need to fetch the user and check the bearer token TTL.

Here are some scenarios where JSON Web Tokens are useful:

Authorisation: This is the most common scenario for using JWT. Once the user is logged in, each subsequent request will include the JWT, allowing the user to access routes, services, and resources that are permitted with that token. Single Sign On is a feature that widely uses JWT nowadays, because of its small overhead and its ability to be easily used across different domains.

Information Exchange: JSON Web Tokens are a good way of securely transmitting information between parties. Because JWTs can be signed—for example, using public/private key pairs—you can be sure the senders are who they say they are. Additionally, as the signature is calculated using the header and the payload, you can also verify that the content hasn’t been tampered with.

auth0.com, JSON Web Token Introduction – jwt.io, viewed 1 September 2021, https://jwt.io/introduction.

How does JWT work?

Pronunciation

Before we start, for a bit of fun I’d like to set the record straight. I’m not one of those boring “absolutists” who insist that GIF should be pronounced “JIF”, even though the original creator of the tech obviously could not spell when he declared that was the pronunciation. But many people are driving me bananas, by mispronouncing JWT: just pronounce it as “JOT”, it’s also easier to say than the most common variant “Jay-Dubbyah-Tee” (this is not a relative to a former US president).

The suggested pronunciation of JWT is the same as the English word “jot”.

Jones M, Microsoft, Bradley J, Ping Identity, Sakimura N, NRI 2020, JSON Web Token (JWT), Internet Engineering Task Force (IETF),viewed 1 September 2021, https://datatracker.ietf.org/doc/html/rfc7519#section-1.

JWT overview

Now for the technical stuff…

JWTs represent a set of claims as a JSON object that is encoded in a
JWS and/or JWE structure. This JSON object is the JWT Claims Set.
As per Section 4 of RFC 7159 [RFC7159], the JSON object consists of
zero or more name/value pairs (or members), where the names are
strings and the values are arbitrary JSON values. These members are
the claims represented by the JWT. This JSON object MAY contain
whitespace and/or line breaks before or after any JSON values or
structural characters, in accordance with Section 2 of RFC 7159
[RFC7159].


The member names within the JWT Claims Set are referred to as Claim
Names. The corresponding values are referred to as Claim Values.
The contents of the JOSE Header describe the cryptographic operations
applied to the JWT Claims Set. If the JOSE Header is for a JWS, the
JWT is represented as a JWS and the claims are digitally signed or
MACed, with the JWT Claims Set being the JWS Payload. If the JOSE
Header is for a JWE, the JWT is represented as a JWE and the claims
are encrypted, with the JWT Claims Set being the plaintext encrypted
by the JWE. A JWT may be enclosed in another JWE or JWS structure to
create a Nested JWT, enabling nested signing and encryption to be
performed.


A JWT is represented as a sequence of URL-safe parts separated by
period (‘.’) characters. Each part contains a base64url-encoded
value. The number of parts in the JWT is dependent upon the
representation of the resulting JWS using the JWS Compact
Serialization or JWE using the JWE Compact Serialization.

Jones M, Microsoft, Bradley J, Ping Identity, Sakimura N, NRI 2020, JSON Web Token (JWT), Internet Engineering Task Force (IETF),viewed 1 September 2021, https://datatracker.ietf.org/doc/html/rfc7519#section-3

What does this mean?

A token is comprised of 3 parts, the header, payload and signature.

The header of the token defines the cryptography applied to the payload, and thepayload is a JSON structure, that is encoded in JWS (Base64 encoded) or JWE (encrypted).

JWS is less secure since it is only a Base64 encoded JSON, and this means it is not suitable in our case. Because the token will carry authentication data for access to the resource and Base64 encoding is different to encryption – it is relatively trivial to decode.

JWE is more suitable to our case, because the JSON payload is encrypted (as defined in the header), and the JWT token keys are stored securely on the on the ApiOpenStudio server, so only the API server can decrypt the body.

There are multiple encryption standards available, and in our case we are using the fantastic lcobucci-jwt library, which is sponsored by one of the leading Authorisation services: Auth0. This provides support for many, many symmetric and asymmetric algorithms.

We have not implemented symmetric algorithms (these are less secure and are the same key for encryption and decryption), so with asymmetric algorithms, the public key can be used by any clients, and the public/private keys are stored in a secure location on the ApiOpenStudio server.

JWT token structure

In its compact form, JSON Web Tokens consist of three parts separated by dots (.), which are:

  • Header
  • Payload
  • Signature

Therefore, a JWT typically looks like the following.

xxxxx.yyyyy.zzzzz

Header

The header typically consists of two parts: the type of the token, which is JWT, and the signing algorithm being used, such as HMAC SHA256 or RSA.

For example:

{
  "alg": "RSA256",
  "typ": "JWT"
}

Then, this JSON is Base64Url encoded to form the first part of the JWT.

Payload

The second part of the token is the payload, which contains the claims. There are three types of claims: registeredpublic, and private claims.

  • Registered claims: These are a set of predefined claims which are not mandatory but recommended, to provide a set of useful, interoperable claims. Some of them are: iss (issuer), exp (expiration time), sub (subject), aud (audience). Notice that the claim names are only three characters long as JWT is meant to be compact.
  • Public claims: These can be defined at will by those using JWTs. But to avoid collisions they should be defined in the IANA JSON Web Token Registry or be defined as a URI that contains a collision resistant namespace.
  • Private claims: These are the custom claims created to share information between parties that agree on using them and are neither registered or public claims.

An example payload could be:

{
  "iss": "my.apiopenstudio.com",
  "sub": "1234567890",
  "name": "John Dory",
  "admin": true
}

The payload is then Base64Url encoded to form the second part of the JSON Web Token.

Signature

To create the signature part you have to take the encoded header, the encoded payload, a secret, the algorithm specified in the header, and sign that.

The signature is used to verify the message wasn’t changed along the way, and in the case of tokens signed with a private key, it can also verify that the sender of the JWT is who it says it is.

Using JWT tokens

The finalxxxxx.yyyyy.zzzzz token is sent in the request as a bearer token in the request header, e.g.:

Authorization: Bearer <token>

When the request is received by the API server, it will first confirm that the token is valid, checking it can be decrypted, the issuer, expiry date and mandatory claims.

If this all passes, the processing can continue, otherwise a 401 error response is sent and the client will need to generate a new token, using the provided core token request (auth/token) and resend the request with the token that was received.

From a processes POV, this is exactly the same as before. However, as mentioned, we are using asymmetric encoding, so we can include data in the token payload that means that ApiOpenStudio does not need too fetch user data in order to validate the user’s permissions against the resource. That only needs to be fetched once – when the token is generated.

Minor issue (caveat)

The original authorisation tokens were stateful, i.e. the token was stored against the user, along with the TTL for the token. This meant that if a user was banned, deleted or made inactive, they would instantly not be able to make any further API requests, due to either the users not being present/active anymore, or the token was no longer valid.

JWT tokens are stateless. This means that if a user has a valid token, they can still use it until it expires regardless of whether they are made inactive or deleted (they would only be prevented from fetching a fresh token).

This can be mitigated by setting a global jwt_life of less than 1 hour (the default), however this needs to be balanced against the increase in requests, since a lower token TTL will lead to more requests due to tokens frequently passing their expiry date more frequently.

Another mitigation for extreme cases can be to add the IP address of the client to the blacklist – this will prevent all future calls from that location, immediately.

3rd party authorisation integration

Because the token requires certain client data to be present, the user details and roles will need to be accessible by the authorisation provider, so that they can generate a valid body. Thankfully, most reputable providers allow you to upload these details to your account with them, and also provide ways to ensure that these details are always current and up to date.

You will need to ensure that the following mandatory claims:

  • iss – JWT issuer (your auth provider)
  • aud – permitted for (your api)
  • iat – JWT issued time
  • exp – JWT expiry time

The following ctsom claims are also included

  • uid – user ID
  • roles – complete list of roles and accounts/applications that the user is associated with

The roles object

This is in the JSON object format of:

roles: [
    {
        "role_name": <role_machine_name>,
        "accid": <account_id>,
        "appid": <application_id>
    }
]

For example:

roles: [
    {
        "role_name": "administrator",
        "accid": null,
        "appid": null
    },
    {
        "role_name": "consumer",
        "accid": 34,
        "appid": 5
    },
    {
        "role_name": "developer",
        "accid": 34,
        "appid": 5
    },
    ....
]

Note that:

  • “administrator” does not require accid or appid
  • “account_manager” does not require appid

Summary

We’re really excited to be implementing this technology, seeing the decrease in resource processing time, increasing the security of the API, and making it even easier for enterprise scale users to implement ApiOpenStudio on a large scale.

I’ll be honest with you too, it was really good fun to implement, and we totally got a total nerd-on, doing the research and coding for this!

Increased security and speed with JWT tokens

Current dev work is almost complete for implementing authorisation with JWT token for all resources! This will be part of the upcoming Beta release.

The ticket can be viewed in Gitlab.

This will replace the existing alpha version of a custom token and token TTL for each user in the user table.

It is quite important to note, before we move on, that JWT tokens are a different thing to oauth2, implicit grant, explicit grant, application grant and PKSE authorisation flow. JWT is only a standard for tokens. If you need to implement oauth2 or other similar workflows this is separate from the JWT implementation.

The problem

The problem with the former approach, was that resource requests had to make DB calls to the user, user_roles, roles, account and application tables in order to verify user permissions to that particular resource, FOR EVERY API CALL. This obviously negatively impacted performance for API calls.

This also meant that authorisation was not easily scalable to authorisation servers for enterprise implementation, because the implementation of the token and authorisation for API calls was tightly coupled to the ApiOpenStudio database and several of its tables.

The solution

Although the former approach was stateful (it maintained login state, so users could login and out), the stateless JWT token approach means that the token does not need to be stored in the database. The downside of stateless JWT tokens, is that there is no logout state. So if a user’s access is revoked, they will still have access to resources until their current token goes stale.

However, this can be mitigated by making the JWT token lifetime short in the ApiOpenStudio configuration.

Each JWT token contains custom claims for user ID and all roles that that user has. So when the initial request is received by ApiOpenStudio, it just decrypts the token and validates the user’s roles against the resources account/application and permissible user roles (i.e. Does the current user have the required role access to the account & application?).

Knock-on effects

The following processors have been retired:

  • user_login.
  • user_logout.

Nearly all core resources have been updated use the new processors:

  • generate_token (generate a valid JWT token for a user, with custom claims: uid, user roles).
  • validate_token (validate the Authorization token as a valild JWT token).
  • validate_token_roles ((validate the Authorization token as a valild JWT token and also validation the user has the correct role permissions for the resource).
  • bearer_token (not used by core atm, but preserved for any processors that need access to the bearer token).

Processors have been optimised, now that they do not need to do any pre-validation on who can do what – this is left to the core resource definitions.

Tests are updated to incorporate the changes, and also now have multiple test users with different roles.

The good news

Not only has this significantly improved the API response time, it has now made the API much more scalable for enterprise. We communicated and researched several major 3rd party authorisation services, including auth0, to make sure that the decision to move to JWT tokens and custom claims would still be viable if a 3rd party auth server was used.

Most 3rd party authorisation services implement linking into external databases, so that would take the heat off the api server for token generation, and allow the token generation to be completely decoupled from ApiOpenStudio. This will be the subject of a future post.

WP Twitter Auto Publish Powered By : XYZScripts.com