swissdatasciencecenter / jsonld4s   0.15.0

Apache License 2.0 GitHub

Scala Circe Extension for JSON-LD

Scala versions: 2.13 2.12

json-ld

This is a Scala library for working with JSON-LD. It is built on top of Circe and follows the design choices of its API.

API

The library comes with an API allowing creation of the following JSON-LD structures:

  • values; all the factories can be found in io.renku.jsonld.JsonLD.fromXXX, where XXX stands for common value types like Int, Boolean, String, etc., plus java.time.Instant, java.time.LocalDate, scala.Option and io.renku.jsonld.EntityId;
  • entities; the io.renku.jsonld.JsonLD.entity comes in a few variations to meet various user needs;
  • edges; io.renku.jsonld.JsonLD.edge;
  • arrays; io.renku.jsonld.JsonLD.arr;
  • entity IDs; io.renku.jsonld.EntityId.of and io.renku.jsonld.EntityId.blank for generating blank node IDs;
  • named graphs; io.renku.jsonld.NamedGraph and io.renku.jsonld.NamedGraph.from controlling effects happening on instantiation;
  • default graphs; io.renku.jsonld.DefaultGraph and io.renku.jsonld.DefaultGraph.from controlling effects happening on instantiation.

Both NamedGraph and DefaultGraph can be instantiated with a list of entities and/or edges. When they have to be instantiated with a generic list of JsonLD objects, the effect-controlling from factory should be used. The factory checks that all the objects are either entities or edges and returns a failure if at least one object is of a different type.
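The check performed by the from factories can be pictured with a small standalone model. The sketch below does not use the library: Node, Entity, Edge, Value and Graph are hypothetical stand-ins, and the error type is an assumption chosen for illustration only.

```scala
// Standalone model; these types are hypothetical and are NOT the library's classes.
sealed trait Node
final case class Entity(id: String)                                   extends Node
final case class Edge(source: String, property: String, target: String) extends Node
final case class Value(raw: String)                                   extends Node

final case class Graph(id: String, nodes: List[Node])

object Graph {
  // Succeeds only when every node is an Entity or an Edge;
  // otherwise the first offending node is reported in a Left.
  def from(id: String, nodes: List[Node]): Either[IllegalArgumentException, Graph] =
    nodes.collectFirst { case v: Value => v } match {
      case Some(bad) => Left(new IllegalArgumentException(s"$bad is neither an entity nor an edge"))
      case None      => Right(Graph(id, nodes))
    }
}

val ok = Graph.from(
  "http://example.org/graph",
  List(
    Entity("http://example.org/e1"),
    Edge("http://example.org/e1", "http://schema.org/knows", "http://example.org/e2")
  )
)

val bad = Graph.from(
  "http://example.org/graph",
  List(Entity("http://example.org/e1"), Value("abc"))
)
```

Returning an Either keeps the failure visible in the type, so the caller decides whether to recover or to fail, which is the idea behind a factory that controls effects.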

Encoding to JSON-LD

The library can encode any Scala object to JSON-LD. Encoders for common types like String, Int, Long, java.time.Instant and java.time.LocalDate are provided by the library. There are also facilities for encoding objects, arrays of objects, options, lists and sets. All the mentioned tools exist in the io.renku.jsonld.JsonLD object.

Example:

import io.renku.jsonld._

JsonLD.fromInt(1)
JsonLD.fromOption(Some("abc"))
JsonLD.arr(JsonLD.fromString("a"), JsonLD.fromString("b"), JsonLD.fromString("c"))
JsonLD.entity(
  EntityId of "http://entity/23424",
  EntityTypes of (Schema.from("http://schema.org") / "Project"),
  Schema.from("http://schema.org") / "name" -> JsonLD.fromString("value")
)

Improved readability can be achieved by providing encoders for the types together with implicits from the io.renku.jsonld.syntax package.

Example:

import io.renku.jsonld._
import io.renku.jsonld.syntax._

val schema: Schema = Schema.from("http://schema.org")

final case class MyType(name: String)

implicit val myTypeEncoder: JsonLDEncoder[MyType] = JsonLDEncoder.instance { entity =>
  JsonLD.entity(
    EntityId of "http://entity/23424",
    EntityTypes of (schema / "Project"),
    schema / "name" -> entity.name.asJsonLD
  )
}

MyType(name = "some name").asJsonLD

Decoding

Decoding is an operation which allows extraction of an object from JSON-LD.

Note: When you decode from flattened JSON-LD, you have to decode to a list of your object type.

import examples.ExampleSchemas.schema
import io.renku.jsonld.syntax._
import io.renku.jsonld.{EntityId, EntityTypes, JsonLD, JsonLDDecoder}

// The type the decoder produces
final case class User(name: String)

// Decodes entities of type schema:Person by reading their schema:name property
implicit val userDecoder: JsonLDDecoder[User] = JsonLDDecoder.entity(EntityTypes.of(schema / "Person")) {
  cursor =>
    cursor.downField(schema / "name").as[String].map(name => User(name))
}

JsonLD
  .entity(
    EntityId of "https://example.com/1234",
    EntityTypes of schema / "Person",
    schema / "name" -> "Angela".asJsonLD
  )
  .cursor
  .as[User]
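As a minimal sketch of the note about flattened JSON-LD, decoding a flattened payload might look like the following. It assumes that flatten returns an Either (the Flattening section only states that a failure is returned) and that a List decoder is available once an implicit User decoder is in scope. The schema value and the example IDs are made up for illustration.

```scala
import io.renku.jsonld.syntax._
import io.renku.jsonld.{EntityId, EntityTypes, JsonLD, JsonLDDecoder, Schema}

val schema: Schema = Schema.from("http://schema.org")

final case class User(name: String)

implicit val userDecoder: JsonLDDecoder[User] =
  JsonLDDecoder.entity(EntityTypes.of(schema / "Person")) { cursor =>
    cursor.downField(schema / "name").as[String].map(name => User(name))
  }

// Two illustrative entities collected into a single array
val payload: JsonLD = JsonLD.arr(
  JsonLD.entity(
    EntityId of "https://example.com/1234",
    EntityTypes of (schema / "Person"),
    schema / "name" -> "Angela".asJsonLD
  ),
  JsonLD.entity(
    EntityId of "https://example.com/5678",
    EntityTypes of (schema / "Person"),
    schema / "name" -> "Bob".asJsonLD
  )
)

// Assumption: flatten yields an Either, so the decoding step is chained with flatMap.
// The target type is List[User], not User, because the flattened form is an array.
val users = payload.flatten.flatMap(_.cursor.as[List[User]])
```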

Conditional decoding

jsonld4s allows defining predicates on Entity Decoders. Predicates, together with Entity Types, work as filters in the decoding process: if there is an Entity with the correct type but the condition encoded in the predicate is not met, the entity is skipped. Predicates come in handy when, for example, there are multiple implementations of a type and each implementation differs in its properties or property values.

Predicates are functions of type Cursor => JsonLDDecoder.Result[Boolean], which gives the user great flexibility. An example of a predicate verifying that the schema:name property on an Entity matches a given value might be:

val predicate: Cursor => JsonLDDecoder.Result[Boolean] =
  _.downField(schema / "name").as[String].map(_ == "some arbitrary name")

For more details see: src/examples/scala/examples/ConditionalDecoding.scala
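The filtering semantics described above can be illustrated without the library. The following is a standalone model under stated assumptions: RawEntity, Person, Result and decodeAll are hypothetical names, and entities are reduced to a set of types plus a map of properties. It only shows how an entity with the right type is skipped when the predicate yields false.

```scala
// Standalone model; none of these names come from the library.
final case class RawEntity(types: Set[String], properties: Map[String, String])
final case class Person(name: String)

type Result[A] = Either[String, A]

val personType = "http://schema.org/Person"
val nameProp   = "http://schema.org/name"

def name(entity: RawEntity): Result[String] =
  entity.properties.get(nameProp).toRight(s"No $nameProp property")

// The predicate reads a property, which may fail, hence Result[Boolean] rather than Boolean.
val predicate: RawEntity => Result[Boolean] = name(_).map(_.startsWith("A"))

def decodeAll(entities: List[RawEntity]): Result[List[Person]] =
  entities
    .filter(_.types.contains(personType)) // first filter: the entity type
    .foldRight[Result[List[Person]]](Right(Nil)) { (entity, acc) =>
      for {
        rest    <- acc
        matches <- predicate(entity)      // second filter: the predicate
        decoded <- if (matches) name(entity).map(n => Person(n) :: rest) else Right(rest)
      } yield decoded
    }

val entities = List(
  RawEntity(Set(personType), Map(nameProp -> "Angela")),
  RawEntity(Set(personType), Map(nameProp -> "Bob")),
  RawEntity(Set("http://schema.org/Project"), Map(nameProp -> "Atlas"))
)

val decoded = decodeAll(entities)
```

Here "Bob" has the right type but fails the predicate and "Atlas" has the wrong type, so only "Angela" is decoded. A failing predicate is not an error; only a predicate that cannot be evaluated produces a Left.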

Cacheable entities

Decoding big JSON-LD payloads into model classes may sometimes perform poorly. For such cases the library offers so-called Cacheable Entities. An entity marked as cacheable is put into an internal cache when it is decoded for the first time. Later, when the same entity is found in other places in the model hierarchy, the decoded instance is taken from the cache rather than being decoded again. Marking an Entity as cacheable can pay off; however, the feature has to be used wisely: adding all entities to the cache, even those that occur in the payload only once, may lead to performance that is even worse than with no cache at all. The reason is the size of the cache: as the cache grows, it becomes slower. A reasonable approach is therefore to mark as cacheable only those Entities whose instances either occur many times in the payload, or occur at least twice and are very costly to decode.

Defining a cacheable entity is simple: use JsonLDDecoder.cacheableEntity instead of JsonLDDecoder.entity. The signatures of both factories are the same.
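A minimal sketch of such a definition, relying on the statement above that both factories share a signature. The User type and the schema value are illustrative assumptions that mirror the decoding example.

```scala
import io.renku.jsonld.syntax._
import io.renku.jsonld.{EntityTypes, JsonLDDecoder, Schema}

val schema: Schema = Schema.from("http://schema.org")

final case class User(name: String)

// Same shape as a JsonLDDecoder.entity definition; only the factory name changes.
// Worth doing only if Person entities repeat in the payload or are costly to decode.
implicit val userDecoder: JsonLDDecoder[User] =
  JsonLDDecoder.cacheableEntity(EntityTypes.of(schema / "Person")) { cursor =>
    cursor.downField(schema / "name").as[String].map(name => User(name))
  }
```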

Parsing

Parsing turns circe Json or a string into a JsonLD object. See src/examples/scala/examples/BasicDecoding.scala.

Flattening

Flattening de-nests the data in order to compress it and to have all entities in a single array; nested entities are replaced with references to their @id. A failure is returned if it is run on JSON-LD containing multiple entities with the same @id but different content.

val nestedJson = json"""..."""
nestedJson.asJsonLD.flatten

Nested Json

{
  "@id": "http://example.org/projects/46955437",
  "@type": "http://schema.org/Project",
  "http://schema.org/member": [
    {
      "@id": "http://example.org/users/82025894",
      "@type": "http://schema.org/Person",
      "http://schema.org/name": {
        "@value": "User1"
      }
    }
  ],
  "http://schema.org/name": {
    "@value": "MyProject"
  }
}

Flattened Json

[
  {
    "@id": "http://example.org/projects/46955437",
    "@type": "http://schema.org/Project",
    "http://schema.org/member": [
      {
        "@id": "http://example.org/users/82025894"
      }
    ],
    "http://schema.org/name": {
      "@value": "MyProject"
    }
  },
  {
    "@id": "http://example.org/users/82025894",
    "@type": "http://schema.org/Person",
    "http://schema.org/name": {
      "@value": "User1"
    }
  }
]

Supported types

The following types are supported for both encoding and decoding:

  • String
  • Int
  • Long
  • Instant
  • LocalDate
  • EntityId
  • JsonLD
  • Boolean
  • List
  • Set
  • Seq
  • Option

Encoding to Json

Every JsonLD object can be turned into Json using its toJson method.
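As a short sketch, an entity built with the factories described earlier can be converted and printed as shown here. The entity ID and property value are illustrative; spaces2 is circe's pretty-printing method on Json.

```scala
import io.circe.Json
import io.renku.jsonld._

val schema: Schema = Schema.from("http://schema.org")

val project: JsonLD = JsonLD.entity(
  EntityId of "http://example.org/projects/46955437",
  EntityTypes of (schema / "Project"),
  schema / "name" -> JsonLD.fromString("MyProject")
)

// toJson yields a plain circe Json value, so every circe printer can be used on it
val json: Json = project.toJson

println(json.spaces2)
```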

Detailed examples

For more examples, please see src/examples/scala/examples.

Ontology

The jsonld4s library allows defining and generating ontologies.

Defining ontology

The ontology of a specific type should be defined using the io.renku.jsonld.ontology.Type class. An example definition can look as follows:

import io.renku.jsonld.ontology._
import io.renku.jsonld.Schema

val prov:   Schema = Schema.from("http://www.w3.org/ns/prov", separator = "#")
val renku:  Schema = Schema.from("https://swissdatasciencecenter.github.io/renku-ontology", separator = "#")
val schema: Schema = Schema.from("http://schema.org")

val subtypeOntology: Type = Type.Def(
  Class(schema / "Thing"),
  DataProperty(schema / "name", xsd / "string")
)

val rootOntology: Type = Type.Def(
  Class(prov / "Activity"),
  ObjectProperties(
    ObjectProperty(renku / "parameter", subtypeOntology)
  ),
  DataProperties(DataProperty(prov / "startedAtTime", xsd / "dateTime"),
                 DataProperty(prov / "endedAtTime", xsd / "dateTime")
  )
)

A Type's class needs to be defined with the Class type, properties linking to other types with a collection of ObjectProperty objects, and simple value properties with a collection of DataProperty objects. The library calculates the properties' ranges and domains automatically during ontology generation.

Generating ontology

An ontology can be generated using the io.renku.jsonld.ontology.generateOntology method. The method takes a Type definition and a Schema.

import io.renku.jsonld.ontology._
import io.renku.jsonld.Schema

val renku:  Schema = Schema.from("https://swissdatasciencecenter.github.io/renku-ontology", separator = "#")
val schema: Schema = Schema.from("http://schema.org")

val ontology: Type = Type.Def(
  Class(schema / "Thing"),
  DataProperty(schema / "name", xsd / "string")
)

generateOntology(ontology, renku)