Type Tailoring without Macros

The dream would be to build a flexible type system that can verify any dynamic program. But we are not quite there yet - mostly because of the non-disjoint nature of data in these programs. Dynamic languages allow types to overlap with one another (for example, in Python a value can be both a string and an iterable), making it nearly intractable for current type checkers to reason about every possible type relationship1. Until we can resolve these relationships in some reasonable amount of time, it might be worth exploring developer-assisted approaches to assist type checking with our current systems.

One such approach is type tailoring2. Type tailoring is a way to for languages with macro systems to introduce domain-specific typing rules. However, the same goal could be realized with modular type checkers with a plugin interface such as Mypy3 that allow developers to extend the type system with domain-specific constructs.

This piece introduces Mypy Plugins and covers some interesting use cases for it: (1) introducing inference support for the Lazy evaluation pattern, and (2) an idealistic idea to preprocess type annotations to safeguard values tagged with the dynamic type (any).

Lazy evaluation inference

The Lazy evaluation pattern is commonplace within the python community for delaying the instantiation of objects until they are needed. This approach optimizes memory by lazily creating objects only when they are required instead of eager initialization at module import time. Type information for this lazy object cannot be available at compile time which makes type checkers unable to verify programs that include such constructs. Language idioms such as this present challenges for gradual type checkers, often forcing developers to fall back to the Any type, which weakens overall soundness guarantees.

Consider this code sample of lazy object instantiation:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
from typing import Any, TypeVar, Generic, Callable, Self

empty = object()
Obj = TypeVar('Obj')

class Lazy(Generic[Obj]):
	def __init__(self: Self, f: Callable[[], Obj]) -> None:
		self._f = f
		self._inst = empty
	
	def __getattr__(self: Self, attr: str) -> Obj:
	if self._inst == empty:
		self._inst = self._f()
		
	return getattr(self._inst, attr)

class NewObj:
	def __init__(self: Self) -> None:
		print("finally created")
	
	def foo(self: Self) -> None:
		print("lookey here. foo() called")
	
def make_obj() -> NewObj:
	return NewObj()

lazy_inst = Lazy(make_obj)

lazy_inst.foo()

Mypy fails to recognize that foo() is a method of our lazily instantiated object. By running Mypy’s special function to display the inferred static type of an expression, reveal_type(lazy_inst.foo()), we see Mypy resorts to Any (the dynamic type) instead of capturing None as the return type.

1
2
3
4
5
reveal_type(lazy_inst) 
# note: Revealed type is "Lazy[NewObj]"

reveal_type(lazy_inst.foo()) 
# note: Revealed type is "Any"

This means we lose precision about our lazy object and further interactions with the object just propagate the Any type all up our type analysis.

This is where our Mypy plugin would come in handy. By implementing attribute hooks4, we can teach Mypy how to correctly type lazy objects. These hooks intercept attribute lookups during static analysis, allowing us to provide Mypy with the actual types that will be available at runtime. For our lazy object, we’ll create a plugin that tells Mypy that any attribute access on our lazy object should resolve to the types of the underlying NewObj instance.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
from functools import partial
from mypy.errorcodes import ATTR_DEFINED
from mypy.plugin import Plugin, AttributeContext
from mypy.subtypes import find_member
from typing import Optional, Callable, Any
from mypy.types import Type, AnyType, Instance

def lazy_plugin_attribute_hook(ctx: AttributeContext, *, attr: str) -> Type:
	if not isinstance(ctx.default_attr_type, AnyType):
		return ctx.default_attr_type

	assert len(ctx.type.args) == 1, ctx.type
	gen_type = ctx.type.args[0]
	mem = find_member(attr, gen_type, gen_type)

	if mem:
		return mem
	else:
		ctx.api.fail(f"unknown attr accessed: {attr}", ctx.context,)
	return ctx.default_attr_type

class LazyPlugin(Plugin):
	def get_attribute_hook(self, fullname: str) -> Optional[Callable[[AttributeContext], Type]]:

	if fullname.startswith('projs.test_mypy.lazy.Lazy'):
		_, _, attrs = fullname.rpartition('.')
		return partial(lazy_plugin_attribute_hook, attr=attrs)

	return None

def plugin(version: str) -> type[LazyPlugin]:
	return LazyPlugin

When we call reveal_type again, Mypy correctly identifies lazy_inst.foo() as returning None, giving us proper type safety for our lazy instantiation pattern.

1
2
3
4
5
6
7
8
reveal_type(lazy_inst) 
# note: Revealed type is "Lazy[NewObj]"

reveal_type(lazy_inst.foo()) 
# note: Revealed type is "None"

reveal_type(lazy_inst.not_exists()) 
# note:  error: unknown attr accessed: not_exists [misc]

Preprocessing Type Hints

In the rest of this post, I attempt employing Mypy plugins to preprocess type annotations dynamically. For instance, consider a poorly typed function returning Any instead of a more specific type like str. We can implement plugin hooks that detect such patterns and nudge type inferencer to better reflect the function’s behaviour during type checking.

1
2
3
4
5
6
7
# poorly typed function
def bad(x: int) -> Any:
	return str(x)

# ideal (internal) modification 
def bad(x: int) -> str:
	return str(x)

To achieve this, FILL.


  1. Github Issues that highlight this problem. TypeScript, Luau and Mypy↩︎

  2. You can check out the OG paper about Type Tailoring with Macros ↩︎

  3. Mypy is a static type checker for Python that comes with a plugin system (see: plugin.py↩︎

  4. Attribute Hooks are a mechanism in the Plugin API that allow you to customize how Mypy handles attribute accesses within your code. Some other mechanism include function and method hooks, class hooks, etc. (see more). ↩︎