Structural Patterns Mining in Web APIs

From OpenAPI documentation to structural patterns

By Souhaila Serbout, 11 March 2021

Slides are available at: https://si-seminar.vercel.app/

Web Application Programming Interface


Web API


Web API Description Languages Timeline

Web API Description Languages

Language | Sponsor | Initial commit | Latest stable release | Stable release date | Software license | Format | Open source | Code generation (client) | Code generation (server)
RAML | MuleSoft | September, 2013 | 1.0 | May 16, 2016 | Apache 2.0 | YAML | Yes | Yes | Yes
API Blueprint | Apiary (now Oracle) | April, 2013 | 1A9 | June 8, 2015 | MIT | MSON | Yes | No | No
OpenAPI | OpenAPI Initiative (OAI) | July, 2011 | 3.1.0 | February 18, 2021 | Apache 2.0 | JSON or YAML | Yes | Yes | Yes

Web APIs

Web API Description Languages


  • API description languages can be thought of as the power tools of the API designer.

  • They make it easy to describe APIs precisely by acting as domain-specific languages.

  • API designers use them to capture API design decisions.

  • They describe the behavior of an API and enable automated tool support for building it, such as the tooling available for OpenAPI.

OpenAPI Specification (Swagger)

  • An OpenAPI definition is a document (or set of documents) that describes an API and conforms to the OpenAPI Specification metamodel.

  • An OpenAPI definition can be used by documentation generation tools to display the API, by code generation tools to generate servers and clients in various programming languages, by testing tools, and in many other use cases.
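
As a minimal sketch of what such tools consume, the snippet below walks a hand-written OpenAPI definition held as a Python dictionary and lists its operations. The `/playlists` paths and the helper name `list_operations` are invented for illustration and do not come from the mined dataset.

```python
# Hypothetical, hand-written OpenAPI 3 definition used only for illustration.
definition = {
    "openapi": "3.0.3",
    "info": {"title": "Example API", "version": "1.0.0"},
    "paths": {
        "/playlists": {
            "get": {"summary": "List playlists"},
            "post": {"summary": "Create a playlist"},
        },
        "/playlists/{id}": {
            # Path-level parameters are not operations and must be skipped.
            "parameters": [{"name": "id", "in": "path", "required": True,
                            "schema": {"type": "string"}}],
            "get": {"summary": "Read one playlist"},
        },
    },
}

# Operation keys allowed on an OpenAPI Path Item Object.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(doc):
    """Return sorted (METHOD, path) pairs, ignoring non-operation keys."""
    ops = []
    for path, item in doc.get("paths", {}).items():
        for key in item:
            if key in HTTP_METHODS:
                ops.append((key.upper(), path))
    return sorted(ops)


operations = list_operations(definition)
for method, path in operations:
    print(method, path)
```

The same traversal is the starting point for any structural analysis: the path strings give the resource hierarchy, and the operation keys give the methods attached to each resource.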

Works that exploited OpenAPI documents

  • István Koren and Ralf Klamma. 2018. The Exploitation of OpenAPI Documentation for the Generation of Web Frontends. In Companion Proceedings of the The Web Conference 2018 (WWW '18). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 781–787. DOI: https://doi.org/10.1145/3184558.3188740

  • Ed-Douibi H., Daniel G., Cabot J. (2020) OpenAPI Bot: A Chatbot to Help You Understand REST APIs. In: Bielikova M., Mikkonen T., Pautasso C. (eds) Web Engineering. ICWE 2020. Lecture Notes in Computer Science, vol 12128. Springer, Cham. https://doi.org/10.1007/978-3-030-50578-3_40

  • S. Schwichtenberg, C. Gerth and G. Engels, From Open API to Semantic Specifications and Code Adapters, 2017 IEEE International Conference on Web Services (ICWS), Honolulu, HI, 2017, pp. 484-491, doi: 10.1109/ICWS.2017.56.

  • Espinoza-Arias, P., Garijo, D., & Corcho, O. (2020, November). Mapping the Web Ontology Language to the OpenAPI Specification. In International Conference on Conceptual Modeling (pp. 117-127). Springer, Cham.

  • Bucaille, S., Izquierdo, J. L. C., Ed-Douibi, H., & Cabot, J. (2020, June). In International Conference on Web Engineering (pp. 533-537). Springer, Cham.

  • Ed-Douibi, H., Izquierdo, J. L. C., & Cabot, J. (2018, June). OpenAPItoUML: a tool to generate UML models from OpenAPI definitions. In International Conference on Web Engineering (pp. 487-491). Springer, Cham.

  • Ed-Douibi, H., Izquierdo, J. L. C., Bordeleau, F., & Cabot, J. (2019, September). WAPIml: Towards a modeling infrastructure for Web APIs. In 2019 ACM/IEEE 22nd International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C) (pp. 748-752). IEEE.

  • Mainas, N., Petrakis, E. G., & Sotiriadis, S. (2017, November). Semantically enriched open API service descriptions in the cloud. In 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS) (pp. 66-69). IEEE.

What?

Exploit the fine-grained descriptions of Web APIs available in OpenAPI files.

For what?

Extract common structures used in Web APIs (structural patterns).

Why?

  • Discover the design decisions made in currently available Web APIs

  • Build a training dataset for a Web API design tool

Dataset Overview


The dataset is composed of 6619 OpenAPI Definitions.


Yearly distribution of mined OpenAPI Definitions

Versions of mined OpenAPI documents

Dataset Overview

OpenAPI documents sizes

APIs providers

Messenger API

Cloud Spanner is a managed, mission-critical, globally consistent and scalable relational database service.

Provider: AppsCode Inc (https://appscode.com/), Year: 2020



Curated Data Index

API for indexing curated datasets with user-defined labels, and searching that index.

Year: 2017

Extracted Fragments



Cloud Spanner API

Cloud Spanner is a managed, mission-critical, globally consistent and scalable relational database service.

Provider: Google

 
    '/v1/{resource}:getIamPolicy':
      post:
        description: 'Gets the access control policy for a database
          resource.
          Returns an empty policy if a database exists but does not
          have a policy set.
          Authorization requires `spanner.databases.getIamPolicy`
          permission on resource.'
        operationId: spanner.projects.instances.databases.getIamPolicy
        parameters:
          - in: body
            name: body
            schema:
              $ref: '#/definitions/GetIamPolicyRequest'
          - in: path
            name: resource
            required: true
            type: string
            x-reservedExpansion: true
        responses:
          '200':
            description: Successful response
            schema:
              $ref: '#/definitions/Policy'
										 
							

Concepts & Metrics


Definitions

API Tree

The output of the model transformation applied to the OpenAPI textual model.

6619 API trees

Tree Structure

The tree structure of an API tree is its unlabeled version: all node labels are removed, and only the leaves keep their HTTP method names.

6619 API tree structures
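
The two definitions above can be sketched as follows. This is an illustrative reading, not the transformation used in the study: the node layout, the function names `build_api_tree` and `tree_structure`, and the choice of a sorted nested tuple as canonical form are all assumptions.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def build_api_tree(paths):
    """Build an API tree: path segments are inner nodes, HTTP methods are leaves."""
    root = {"label": "/", "children": {}, "methods": set()}
    for path, item in paths.items():
        node = root
        for segment in [s for s in path.split("/") if s]:
            node = node["children"].setdefault(
                segment, {"label": segment, "children": {}, "methods": set()})
        node["methods"].update(m.upper() for m in item if m in HTTP_METHODS)
    return root


def tree_structure(node):
    """Drop segment labels and keep method names; return a canonical nested tuple."""
    leaves = tuple(sorted(node["methods"]))
    # Sorting the children makes the result independent of declaration order.
    kids = tuple(sorted(tree_structure(c) for c in node["children"].values()))
    return (leaves, kids)


# Two hypothetical APIs with different labels but the same shape.
users_api = {"/users": {"get": {}}, "/users/{id}": {"get": {}, "delete": {}}}
pets_api = {"/pets": {"get": {}}, "/pets/{petId}": {"get": {}, "delete": {}}}

same = tree_structure(build_api_tree(users_api)) == tree_structure(build_api_tree(pets_api))
print(same)  # True
```

Because the structure ignores labels, `users` and `pets` collapse to the same value, which is what allows fragments from unrelated APIs to be grouped.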

API Fragment

A fragment of an API is any subtree of the API tree T, having a subset of the leaves of T.

273,072 Fragments of size >= 4

70,601 Unique tree structures
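
A simplified sketch of fragment extraction is shown below. It only enumerates the complete subtree rooted at each inner node, whereas the definition above also admits subtrees that keep a subset of the leaves, so it under-approximates the real fragment set. The nested-dictionary encoding (a value of `None` marks a method leaf) and the function names are assumptions made for this example; size is counted as the number of nodes.

```python
def size(tree):
    """Number of nodes in a subtree, counting its root and its method leaves."""
    if tree is None:  # a method leaf
        return 1
    return 1 + sum(size(child) for child in tree.values())


def fragments(label, tree, min_size=4):
    """Yield (label, subtree) for every inner node whose subtree is large enough."""
    if tree is None:
        return
    if size(tree) >= min_size:
        yield label, tree
    for child_label, child in tree.items():
        yield from fragments(child_label, child, min_size)


# Hypothetical API tree: the top-level dictionary is the root node.
api_tree = {
    "users": {"GET": None, "{id}": {"GET": None, "DELETE": None}},
    "health": {"GET": None},
}

found = [label for label, _ in fragments("/", api_tree)]
print(found)  # ['/', 'users']
```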

Label sequence

An ordered sequence of labels extracted from the tree nodes.

66,874 unique label sequences
Example: LS = {calendar, map, search}
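
One possible way to obtain such a sequence is a pre-order walk over the path-segment nodes, skipping the method leaves. The traversal order and the nested-dictionary encoding (a value of `None` marks a method leaf) are assumptions of this sketch; the slides do not state which ordering was used.

```python
def label_sequence(tree):
    """Collect path-segment labels in pre-order, ignoring method leaves."""
    labels = []
    for label, child in tree.items():
        if child is not None:  # method leaves carry no segment label
            labels.append(label)
            labels.extend(label_sequence(child))
    return labels


# Hypothetical fragment reproducing the example sequence above.
fragment = {
    "calendar": {
        "GET": None,
        "map": {"GET": None},
        "search": {"GET": None},
    }
}

ls = label_sequence(fragment)
print(ls)  # ['calendar', 'map', 'search']
```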

Metrics

Occurrence

The number of times a specific fragment's tree structure appears in the whole API dataset.

Popularity

The number of APIs in which a fragment's tree structure appears.

Size

Number of nodes in a fragment tree.
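
The difference between occurrence and popularity can be illustrated with a small count. The input mapping and the structure identifiers `S1`, `S2`, `S3` are made up for this sketch; in practice each identifier would stand for a canonical tree structure.

```python
from collections import Counter

# Hypothetical input: for each API, the tree structures of its fragments.
fragments_per_api = {
    "api_a": ["S1", "S1", "S2"],
    "api_b": ["S1"],
    "api_c": ["S2", "S3"],
}

occurrence = Counter()  # every appearance counts
popularity = Counter()  # each API counts at most once per structure

for structures in fragments_per_api.values():
    occurrence.update(structures)
    popularity.update(set(structures))

print(occurrence["S1"], popularity["S1"])  # 3 2
```

A structure that is repeated many times inside a single large API has high occurrence but low popularity, which is why the two metrics are reported separately.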

Analytics Goals

Extraction of unique fragment tree structures

Grouping fragments with similar tree structures

Context of use of a tree structure

Calculating the semantic closeness between the label sequences extracted from the fragments

Purpose of use of a tree structure

Extracting the descriptions associated with each HTTP method
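
The slides do not say which measure is used for semantic closeness. As a crude lexical stand-in only, the sketch below compares two label sequences with the Jaccard index over their label sets; a real semantic measure would need word embeddings or a thesaurus, which this example does not attempt. The function name and the sample sequences are hypothetical.

```python
def jaccard(seq_a, seq_b):
    """Share of distinct labels that two label sequences have in common."""
    a, b = set(seq_a), set(seq_b)
    if not a and not b:
        return 1.0  # two empty sequences are treated as identical
    return len(a & b) / len(a | b)


first = ["calendar", "map", "search"]
second = ["calendar", "events", "search"]

print(jaccard(first, second))  # 0.5
```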

Fragments - Occurrence vs. Popularity


Fragments - Occurrence, Popularity and Size

Collection Fragments

Structural patterns classification


Pattern | Variants
Read-only | read-collection/item
Enumerable Collection | Read-write; Read/Write/Delete; Read/Write/Delete; Read-POST
Appendable Collection | GET/PUT/DELETE; GET/DELETE; PUT/DELETE
Collection | All operations; GET/PUT/DELETE; GET/PATCH/DELETE; GET/PUT; PUT/DELETE; GET/DELETE; DELETE only; Update only; Read only
Mutable Collection | Replace all - Delete one; Replace all - Read and delete one; Delete all

1. Read-only

  • Summary. Expose an immutable set of items within their own container resource.

  • Problem. How to publish one or more immutable sets of related items through the API? How to make the collection items discoverable by clients?

  • Solution. Provide a unique address for each collection item. Allow clients only to read the content of each item by applying the GET method to the address of the item. Group together related items under the same prefix.

    Allow clients to enumerate the items within the collection by applying the GET method to the container resource.

  • Related patterns. This pattern makes use of the Enumerable Collection pattern, with the constraint that the collection items only expose the GET method.


Size: 4, Occurrence: 1588, Distinct Labels: 744
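
A hedged sketch of how the Read-only pattern could be detected in an API tree: a container exposing only GET, with exactly one parameterized item child that also exposes only GET (four nodes in total). The nested-dictionary encoding (a value of `None` marks a method leaf), the brace test for item segments, and the function names are assumptions of this example.

```python
def is_item_segment(label):
    """Path templating marks item identifiers with braces, e.g. {id}."""
    return label.startswith("{") and label.endswith("}")


def methods(node):
    """Method leaves attached directly to a node."""
    return {key for key, value in node.items() if value is None}


def find_read_only_collections(tree, prefix=""):
    """Yield paths of containers that match the Read-only pattern."""
    for label, child in tree.items():
        if child is None:
            continue
        path = f"{prefix}/{label}"
        subresources = {lbl: c for lbl, c in child.items() if c is not None}
        items = [c for lbl, c in subresources.items() if is_item_segment(lbl)]
        if (methods(child) == {"GET"}
                and len(subresources) == 1 and len(items) == 1
                and methods(items[0]) == {"GET"}
                and len(items[0]) == 1):  # the item has no sub-resources
            yield path
        yield from find_read_only_collections(child, path)


# Hypothetical API tree with one read-only and one writable collection.
tree = {
    "countries": {"GET": None, "{code}": {"GET": None}},
    "orders": {"GET": None, "POST": None, "{id}": {"GET": None, "DELETE": None}},
}

matches = list(find_read_only_collections(tree))
print(matches)  # ['/countries']
```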

2. Enumerable Collection

  • Summary. Use the container resource to enumerate its content.

  • Problem. How to make the collection items discoverable by clients?

  • Solution. Group together related items under the same prefix. Allow clients to enumerate the items within the collection by applying the GET method to the container resource.


Variant (a)


Size: 5, Occurrence: 99, Distinct Labels: 43

Variant (b) (Potential Antipattern)

Size: 6, Occurrence: 176, Distinct Labels: 71

Descriptions associated with the DELETE method:

  1. Deletes the MySQL Server key with the given name
  2. Deletes the PostgreSQL Server key with the given name
  3. Deletes the log profile
  4. Deletes the specified Azure key vault
  5. Deletes the specified application security group
  6. Deletes the specified public IP address
  7. Remove a Property
  8. Remove a Resource
  9. Remove a single task
  10. Remove an episode vote
  11. The operation to delete a container service
  12. Unfollow a network
  13. Unfollow a person
  14. Unfollow a show
  15. Unfollow a webchannel
  16. Unmark an episode
  17. delete an Ad- you must own the Ad and be logged in to delete an Ad

Descriptions associated with all the operations of the collection and its items:

Path segment | Method | Description
people | GET | List the followed people
{people_id} | GET | Check if a person is followed
{people_id} | DELETE | Unfollow a person
{people_id} | PUT | Follow a person

Example of Variant (b): TvMaze user API


Variant (c) (Potential Antipattern)


Size: 5, Occurrence: 77, Distinct Labels: 22

Based on the descriptions, we can detect two main use cases for the POST method on the collection items:

(1) Appending an item to the collection

(2) Updating an attribute of an item

3. Appendable Collection

  • Summary. Append new items by posting them to the container resource.

  • Problem. How to offer clients the ability to add new items into the collection?

  • Solution. Allow clients to use the POST method on the container resource to append new items into the collection. The address of the newly created items must be returned to the clients, since this pattern does not feature the ability for clients to enumerate the content of the collection.
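
The key point of the solution, returning the address of each new item, can be sketched with a small in-memory model. This is not a web server: the class name, the `/messages` prefix, and the sequential identifiers are hypothetical, and the `(status, headers)` tuple only mimics an HTTP `201 Created` response carrying a `Location` header.

```python
class AppendableCollection:
    """In-memory container resource that supports POST but has no list operation."""

    def __init__(self, prefix):
        self.prefix = prefix
        self._items = {}
        self._next_id = 1

    def post(self, item):
        """Append an item and return (status, headers) like a 201 response."""
        item_id = str(self._next_id)
        self._next_id += 1
        self._items[item_id] = item
        # Without enumeration, this header is the only way to find the item again.
        return 201, {"Location": f"{self.prefix}/{item_id}"}

    def get_item(self, item_id):
        """Read one item by its identifier; returns (status, body)."""
        if item_id in self._items:
            return 200, self._items[item_id]
        return 404, None


messages = AppendableCollection("/messages")
status, headers = messages.post({"text": "hello"})
print(status, headers["Location"])  # 201 /messages/1
```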


Variant (a): Size: 5, Occurrence: 145, Distinct Labels: 43

Variant (b): Size: 5, Occurrence: 196, Distinct Labels: 64

Variant (c) (Potential Antipattern): Size: 5, Occurrence: 50, Distinct Labels: 31

4. Mutable Collection Items

  • Summary. Expose mutation methods for individual collection items. However, no operations can be performed on the collection itself.

  • Problem. How to let clients modify the content of each item of a collection?

  • Solution. Allow clients to apply the DELETE method and PUT or POST methods on individual collection items.

Variant (a): Size: 5, Occurrence: 78, Distinct Labels: 48

Variant (b): Size: 5, Occurrence: 323, Distinct Labels: 168

Variant (c) (Potential Antipattern): Size: 5, Occurrence: 201, Distinct Labels: 87


The POST operation in Variant (c) is used for two purposes:

  • The POST operation is used to fulfill the role of PUT.

  • The POST operation is used to append items to the collection.

5. Collection

Also known as: Enumerable-Appendable Collection, List/Add

  • Summary. Use the container resource to enumerate its content and add new items.

  • Problem. How to make the collection items discoverable by clients?

  • Solution. Group together related items under the same prefix. Allow clients to enumerate the items within the collection by applying the GET method to the container resource. Clients can use the POST method on the same container resource to add new items.
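
The five pattern descriptions can be condensed into a rough classifier over the method sets of a container and its items. The rule order and the exact conditions are a simplification made for this sketch; they ignore the variants and are not the classification procedure used in the study.

```python
def classify(container_methods, item_methods):
    """Map the methods of a container and its items to a pattern name."""
    c, i = set(container_methods), set(item_methods)
    if c == {"GET"} and i == {"GET"}:
        return "Read-only"
    if {"GET", "POST"} <= c:
        return "Collection"  # enumerable and appendable
    if "GET" in c:
        return "Enumerable Collection"
    if "POST" in c:
        return "Appendable Collection"
    if not c and "DELETE" in i and i & {"PUT", "POST"}:
        return "Mutable Collection Items"
    return "Unclassified"


print(classify({"GET", "POST"}, {"GET", "PUT", "DELETE"}))  # Collection
```

For example, a container with no methods whose items expose PUT and DELETE falls under Mutable Collection Items, while a container exposing only POST is treated as an Appendable Collection.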

Variant (a): Size: 5, Occurrence: 78, Distinct Labels: 48
Variant (b): Size: 5, Occurrence: 323, Distinct Labels: 168
Variant (c): Size: 5, Occurrence: 201, Distinct Labels: 87
Variant (e): Size: 6, Occurrence: 63, Distinct Labels: 47
Variant (f): Size: 5, Occurrence: 345, Distinct Labels: 187
Variant (g): Size: 6, Occurrence: 169, Distinct Labels: 84
Variant (h): Size: 5, Occurrence: 232, Distinct Labels: 139
Variant (i): Size: 8, Occurrence: 328, Distinct Labels: 159
Variant (j): Size: 7, Occurrence: 1123, Distinct Labels: 574

Spotifree

A REST API with Flask backend for managing music playlists

Conclusion: Structural Patterns Mining in Web APIs