1

How to implement the function normalize(type: Type): Type such that:

A =:= B if and only if normalize(A) == normalize(B) and normalize(A).hashCode == normalize(B).hashCode.

In other words, normalize must return equal results for all equivalent Type instances; and not equal nor equivalent results for all pair of non equivalent inputs.

There is a deprecated method called normalize in the TypeApi, but it does not the same.

In my particular case I only need to normalize types that represent a class or a trait (tpe.typeSymbol.isClass == true).

Edit 1: The fist comment suggests that such a function might not be possible to implement in general. But perhaps it is possible if we add another constraint: B is obtained by navigating from A. In the next example fooType would be A, and nextAppliedType would be B:

import scala.reflect.runtime.universe._
sealed trait Foo[V]
case class FooImpl[V](next: Foo[V]) extends Foo[V]

scala> val fooType = typeOf[Foo[Int]]
val fooType: reflect.runtime.universe.Type = Foo[Int]

scala> val nextType = fooType.typeSymbol.asClass.knownDirectSubclasses.iterator.next().asClass.primaryConstructor.typeSignature.paramLists(0)(0).typeSignature
val nextType: reflect.runtime.universe.Type = Foo[V]

scala> val nextAppliedType = appliedType(nextType.typeConstructor, fooType.typeArgs)
val nextAppliedType: reflect.runtime.universe.Type = Foo[Int]

scala> println(fooType =:= nextAppliedType)
true

scala> println(fooType == nextAppliedType)
false

Inspecting the Type instances with showRaw shows why they are not equal (at least when Foo and FooImpl are members of an object, in this example, the jsfacile.test.RecursionTest object):

scala> showRaw(fooType)
val res2: String = TypeRef(SingleType(SingleType(SingleType(ThisType(<root>), jsfacile), jsfacile.test), jsfacile.test.RecursionTest), jsfacile.test.RecursionTest.Foo, List(TypeRef(ThisType(scala), scala.Int, List())))

scala> showRaw(nextAppliedType)
val res3: String = TypeRef(ThisType(jsfacile.test.RecursionTest), jsfacile.test.RecursionTest.Foo, List(TypeRef(ThisType(scala), scala.Int, List())))

The reason I need this is difficult to explain. Let's try:

I am developing this JSON library which works fine except when there is a recursive type reference. For example:

sealed trait Foo[V]
case class FooImpl[V](next: Foo[V]) extends Foo[V]

That happens because the parser/appender it uses to parse and format are type classes that are materialized by an implicit macro. And when an implicit parameter is recursive the compiler complains with a divergence error. I tried to solve that using by-name implicit parameter but it not only didn't solve the recursion problem, but also makes many non recursive algebraic data type to fail. So, now I am trying to solve this problem by storing the resolved materializations in a map, which also would improve the compilation speed. And that map key is of type Type. So I need to normalize the Type instances, not only to be usable as key of a map, but also to equalize the values generated from them.

4
  • Why do you think such function is possible? I guess we are not supposed to rely on equals/hashCode for Types (on contrary to =:=). In the deprecation notice for standard normalize it's recommended to use dealias or etaExpand instead. Why are they not enough in your use case? Please provide several examples of input and desirable output for such hypothetical custom function normalize. Commented Nov 10, 2020 at 19:50
  • 1
    Generally this sounds like XY. Please describe what you're actually doing and what you need? Commented Nov 10, 2020 at 20:12
  • Your normalize looks like selecting a representative in every equivalence class with respect to the relation =:= on Types (link :) But it's not clear how to prefer a single representative in every equivalence class. Commented Nov 10, 2020 at 20:18
  • @DmytroMitin If I understood you well, any equivalence class would be fine. There is no preference. Commented Nov 10, 2020 at 21:30

1 Answer 1

1

If I understood you well, any equivalence class would be fine. There is no preference.

I suspect you didn't. At least "any equivalence class would be fine", "There is no preference" do not sound good. I'll try to elaborate.

In math there is such construction as factorization. If you have a set A and equivalence relation ~ on this set (relation means that for any pair of elements from A we know whether they are related a1 ~ a2 or not, equivalence means symmetricity a1 ~ a2 => a2 ~ a1, reflexivity a ~ a, transitivity a1 ~ a2, a2 ~ a3 => a1 ~ a3) then you can consider the factor-set A/~ whose elements are all equivalence classes A/~ = { [a] | a ∈ A} (the equivalence class

[a] = {b ∈ A | b ~ a}

of an element a is a set consisting of all elements equivalent (i.e. ~-related) to a).

The axiom of choice says that there is a map (function) from A/~ to A i.e. we can select a representative in every equivalence class and in such way form a subset of A (this is true if we accept the axiom of choice, if we don't then it's not clear whether we get a set in such way). But even if we accept the axiom of choice and therefore there is a function A/~ -> A this doesn't mean we can construct such function.

Simple example. Let's consider the set of all real numbers R and the following equivalence relation: two real numbers are equivalent r1 ~ r2 if their difference is a rational number

r2 - r1 = p/q ∈ Q

(p, q≠0 are arbitrary integers). This is an equivalence relation. So it's known that there is a function selecting a single real number from any equivalence class but how to define this function explicitly for a specific input? For example what is the output of this function for the input being the equivalence class of 0 or 1 or π or e or √2 or log 2...?

Similarly, =:= is an equivalence relation on types, so it's known that there is a function normalize (maybe there are even many such functions) selecting a representative in every equivalence class but how to prefer a specific one (how to define or construct the output explicitly for any specific input)?

Regarding your struggle against implicit divergence. It's not necessary that you've selected the best possible approach. Sounds like you're doing some compiler work manually. How do other json libraries solve the issue? For example Circe? Besides by-name implicits => there is also shapeless.Lazy / shapeless.Strict (not equivalent to by-name implicits). If you have specific question about deriving type classes, overcoming implicit divergence maybe you should start a different question about that?

Regarding your approach with HashMap with Type keys. I'm still reminding that we're not supposed to rely on == for Types, correct comparison is =:=. So you should build your HashMap using =:= rather than ==. Search at SO for something like: hashmap custom equals.


Actually I guess your normalize sounds like you want some caching. You should have a type cache. Then if you asked to calculate normalize(typ) you should check whether in the cache there is already a t such that t =:= typ. If so you should return t, otherwise you should add typ to the cache and return typ.

This satisfies your requirement: A =:= B if and only if normalize(A) == normalize(B) (normalize(A).hashCode == normalize(B).hashCode should follow from normalize(A) == normalize(B)).


Regarding transformation of fooType into nextAppliedType try

def normalize(typ: Type): Type = typ match { 
  case TypeRef(pre, sym, args) =>
    internal.typeRef(internal.thisType(pre.typeSymbol), sym, args) 
}

Then normalize(fooType) == nextAppliedType should be true.

Sign up to request clarification or add additional context in comments.

9 Comments

Your answer is fabulous! I will try to understand the math part tomorrow. Regarding the relation between A, B, and the example's Type instances, I edited the question to clarify that. I knew about shapeless.Lazy but before copying it I wanted to try solving the recursive problem this other way; which has the advantage (if it works) that also improves compilation speed. Using a custom equals and hashCode does not work because the values in the map depend on the keys, and I suspect that equivalent but not equal Type instances don't satisfy Liskov principle for the typeChecker.
@Readren "Regarding the relation between A, B, and the example Type instances, I edited the question to clarify that." Ok, so fooType is A, and nextAppliedType is B. But what are outputs for normalize(fooType), normalize(nextAppliedType) or outputs on any different inputs? There is simple rule in programming. If you can't formulate what your algorithm (program, function) does then most probably you'll not be able to implement it. Sounds not so like you want to implement normalize as you want to change Type internal behavior.
@Readren Actually I guess your normalize sounds like you want some caching. You should have a type cache. Then if you asked to calculate normalize(typ) you should check whether in the cache there is already a t such that t =:= typ. If so you should return t, otherwise you should add typ to the cache and return typ.
I did what you suggest (wrapping the keys and reimplement the equals and hashCode) before asking the question. And the compiler changed the complain from divergence to type-check error. So, I suspected that the outcome of operations to equivalent but not equal Type instances are different for the type checker. Now I am beginning to discard that idea.
@Readren So, can normalize with caching work for you? (If you asked to calculate normalize(typ) you check whether in the cache there is already a t such that t =:= typ. If so you return t, otherwise you add typ to the cache and return typ.) I guess this satisfies your requirement: A =:= B if and only if normalize(A) == normalize(B) (normalize(A).hashCode == normalize(B).hashCode should follow from normalize(A) == normalize(B)).
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.