Better static analysis with entity type storages in phpstan-drupal 1.1.0

I am happy to announce the 1.1.0 release of phpstan-drupal! This is a minor version bump due to a breaking change in the configuration options for phpstan-drupal. Before we dive in, I want to give major thanks to brambaud for their outstanding contributions this month. Their work has brought excitement to the table for untangling Drupal’s magical bits for static analysis. I also want to thank eiriksm for his work to fix some of Drupal’s magic when fetching field value properties.

Here is a summary of the significant improvements:

  • On analysis level 2, PHPStan considered $entity->get('my_field')->value invalid but $entity->get('my_field')->first()->value valid. The former is allowed because field item list classes default to the first item when a property is read through the magic __get method, and PHPStan now understands that.
  • When an entity query is executed, the return type may be an int or an array of entity IDs, depending on whether the count method was called before execution to create a count query. Previously, PHPStan read the return type from the method's docblock, and that was it. Now, it will properly return int for count queries, array<int, string> for content entities, and array<string, string> for configuration entities.
  • Entity storage methods now return the appropriate entity class if it has been defined in the configuration. Before this release, phpstan-drupal only allowed configuring the entity storage class to improve static analysis. With 1.1.0, the entity class can also be defined. If a storage class is not provided, but the entity class is, phpstan-drupal will infer the appropriate default storage based on whether it is a content or configuration entity.

All of these features are provided by implementing dynamic return type extensions. A dynamic return type extension helps PHPStan understand what a method will return based on the arguments passed to it. Take getting an entity type’s storage from the entity type manager:

\Drupal::entityTypeManager()->getStorage('node');
\Drupal::entityTypeManager()->getStorage('user');
\Drupal::entityTypeManager()->getStorage('block');

Each of these entity types has its own storage class. And these storage classes return specific entity type classes, not just the generic EntityInterface that the code documents.

Configuring entity mapping

Previously, only the storage class could be configured, through entityTypeStorageMapping:

parameters:
  drupal:
    entityTypeStorageMapping:
      node: Drupal\node\NodeStorage
      taxonomy_term: Drupal\taxonomy\TermStorage
      user: Drupal\user\UserStorage

This has now been moved to entityMapping so that we can easily add more information to improve the static analysis of Drupal's entity code.

parameters:
  drupal:
    entityMapping:
      node:
        class: Drupal\node\Entity\Node
        storage: Drupal\node\NodeStorage
      taxonomy_term:
        class: Drupal\taxonomy\Entity\Term
        storage: Drupal\taxonomy\TermStorage
      user:
        class: Drupal\user\Entity\User
        storage: Drupal\user\UserStorage
      block:
        class: Drupal\block\Entity\Block

I am not sure how many folks this will impact, since many users are purely using phpstan-drupal for deprecation checks. But, I am getting very excited that more are beginning to use it for actual static analysis.

How does the entity type storage analysis work?

The following entity types have been configured: node, taxonomy_term, user, and block.

assertType('Drupal\node\NodeStorage', $etm->getStorage('node'));
assertType('Drupal\user\UserStorage', $etm->getStorage('user'));
assertType('Drupal\taxonomy\TermStorage', $etm->getStorage('taxonomy_term'));
assertType('Drupal\Core\Entity\EntityStorageInterface', $etm->getStorage('search_api_index'));
assertType('Drupal\Core\Config\Entity\ConfigEntityStorage', $etm->getStorage('block'));

The EntityTypeManagerGetStorageDynamicReturnTypeExtension return type extension listens for getStorage method calls. It then determines the appropriate class to be returned as an ObjectType in PHPStan's Type System. However, just having the class reflection isn't enough. We extended the ObjectType to represent generic entity storage, content entity storage, or config entity storage. This allows us to perform better analysis when a query is performed.
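To make the lookup concrete, here is a minimal sketch in plain PHP of how a storage class could be picked from the entityMapping configuration. It is not the extension's actual code and does not use the PHPStan API: resolveStorageClass and the stand-in interfaces and classes are hypothetical, and the default storage used for content entities is an assumption that this post does not state.

```php
<?php
declare(strict_types=1);

// Stand-ins for Drupal's entity interfaces and classes so the sketch runs on
// its own. In a real project these come from Drupal core.
interface ContentEntityInterface {}
interface ConfigEntityInterface {}
final class Node implements ContentEntityInterface {}
final class Block implements ConfigEntityInterface {}

/**
 * Hypothetical resolver: picks the storage class for an entity type ID.
 *
 * @param array<string, array{class?: string, storage?: string}> $entityMapping
 */
function resolveStorageClass(array $entityMapping, string $entityTypeId): string
{
    $definition = $entityMapping[$entityTypeId] ?? [];

    // An explicitly configured storage class always wins.
    if (isset($definition['storage'])) {
        return $definition['storage'];
    }

    // Otherwise infer a default from the kind of entity class, if one is set.
    $class = $definition['class'] ?? null;
    if ($class !== null && is_a($class, ConfigEntityInterface::class, true)) {
        return 'Drupal\Core\Config\Entity\ConfigEntityStorage';
    }
    if ($class !== null && is_a($class, ContentEntityInterface::class, true)) {
        // Assumed default for content entities; not stated in this post.
        return 'Drupal\Core\Entity\Sql\SqlContentEntityStorage';
    }

    // Unmapped entity types keep the generic storage interface.
    return 'Drupal\Core\Entity\EntityStorageInterface';
}

// Mirrors part of the configuration above: node has both keys, block only a class.
$mapping = [
    'node' => ['class' => Node::class, 'storage' => 'Drupal\node\NodeStorage'],
    'block' => ['class' => Block::class],
];
```

With this mapping, 'node' resolves to its configured storage, 'block' falls back to the configuration entity storage, and an unmapped ID such as 'search_api_index' stays at the generic interface, matching the assertions above.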

Let’s look at the next bit of improvements: determining the return type from an entity query!

assertType(
    'array<string, string>',
    \Drupal::entityTypeManager()->getStorage('block')->getQuery()
        ->execute()
);
assertType(
    'array<int, string>',
    \Drupal::entityTypeManager()->getStorage('node')->getQuery()
        ->accessCheck(TRUE)
        ->execute()
);
assertType(
    'int',
    \Drupal::entityTypeManager()->getStorage('node')->getQuery()
        ->accessCheck(TRUE)
        ->count()
        ->execute()
);

Let’s break that down real quick.

  • The execute method has a documented return type of int|array. By default, it returns an array of entity IDs. When count() was called first, making it a count query, it returns an integer count of the matching entities.
  • Content entities have serial (auto-incrementing) identifiers. The result is an array keyed by the entity ID or revision ID, with the entity ID as the value. As you may notice, the node query return type is array<int, string>. PHP converts the keys to integers because they are numeric array keys. But! Numbers are not automatically cast to integers when retrieved from the database, so the ID values are integer strings.
  • Configuration entities have string identifiers. The entity query result will always be an array of configuration entity IDs, keyed by their ID as well. That is why it has the array<string, string> return type.
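The three result shapes above can be reproduced in plain PHP. The sketch below uses a hypothetical FakeEntityQuery class, not Drupal's query API, and the row data is made up. It only illustrates that PHP casts integer-like string keys to int while the values stay strings, and that calling count() switches the result to an int.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for an entity query; not Drupal's QueryInterface.
 */
final class FakeEntityQuery
{
    /** @var array<int, array{key: string, id: string}> Raw rows, all strings, as a database driver may return them. */
    private array $rows;

    private bool $count = false;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    /** Marks the query as a count query. */
    public function count(): self
    {
        $this->count = true;
        return $this;
    }

    /** @return int|array<int|string, string> */
    public function execute()
    {
        if ($this->count) {
            return count($this->rows);
        }
        $result = [];
        foreach ($this->rows as $row) {
            // PHP casts decimal-integer string keys such as '7' to int keys.
            $result[$row['key']] = $row['id'];
        }
        return $result;
    }
}

// Content entity style rows: revision ID as key, entity ID as value.
$nodes = (new FakeEntityQuery([['key' => '7', 'id' => '42']]))->execute();

// Configuration entity style rows: string ID as both key and value.
$blocks = (new FakeEntityQuery([
    ['key' => 'olivero_branding', 'id' => 'olivero_branding'],
]))->execute();

// Count query: an int instead of an array.
$total = (new FakeEntityQuery([['key' => '7', 'id' => '42']]))->count()->execute();
```

Here $nodes has an int key with a string value, $blocks keeps its string key, and $total is an int.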

When brambaud delivered this pull request, I was blown away. I don’t know if you’re as excited as I am yet. But if you’re not, I hope this next improvement changes that!

Entity storages have multiple methods for creating and loading entities:

  • create – creates a new unsaved entity object.
  • load – loads an entity by its identifier.
  • loadUnchanged – loads the entity by identifier directly from storage, bypassing any static cache.
  • loadMultiple – loads multiple entities by their identifiers.
  • loadByProperties – loads multiple entities based on properties, a shortcut for an entity query.

Now, let’s look at the type assertions. Without the new EntityStorageDynamicReturnTypeExtension return type extension, a generic entity interface is returned.

Here are the type assertions for node storage methods.

$nodeStorage = \Drupal::entityTypeManager()->getStorage('node');
assertType('Drupal\node\Entity\Node', $nodeStorage->create(['type' => 'page', 'title' => 'foo']));
assertType('Drupal\node\Entity\Node|null', $nodeStorage->load(42));
assertType('Drupal\node\Entity\Node|null', $nodeStorage->loadUnchanged('42'));
assertType('array<int, Drupal\node\Entity\Node>', $nodeStorage->loadMultiple([42, 29]));
assertType('array<int, Drupal\node\Entity\Node>', $nodeStorage->loadMultiple(NULL));
assertType('array<int, Drupal\node\Entity\Node>', $nodeStorage->loadByProperties([]));

As you can see, the Drupal\node\Entity\Node class is properly returned!

But, if an entity type is encountered that has not been mapped, the defaults inferred from the docblock are used.

$storage = \Drupal::entityTypeManager()->getStorage('unknown_entity_type_id');
assertType('Drupal\Core\Entity\EntityInterface', $storage->create(['name' => 'foo']));
assertType('Drupal\Core\Entity\EntityInterface|null', $storage->load(42));
assertType('Drupal\Core\Entity\EntityInterface|null', $storage->loadUnchanged(42));
assertType('array<Drupal\Core\Entity\EntityInterface>', $storage->loadMultiple([42, 29]));
assertType('array<Drupal\Core\Entity\EntityInterface>', $storage->loadMultiple(NULL));
assertType('array<Drupal\Core\Entity\EntityInterface>', $storage->loadByProperties([]));
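The pattern behind both sets of assertions can be summarised in a small plain-PHP sketch. The describeStorageReturnType helper is hypothetical and only builds the type descriptions as strings. It does not use the PHPStan API, and it models only content entities such as node, since the int array key is what the node assertions show.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: describes the type a storage method call resolves to.
 *
 * $entityClass is the mapped entity class, or null when the entity type has
 * no mapping. Returns null for methods this sketch does not cover.
 */
function describeStorageReturnType(string $method, ?string $entityClass): ?string
{
    // Unmapped entity types fall back to the interface named in the docblocks.
    $class = $entityClass ?? 'Drupal\Core\Entity\EntityInterface';

    // Mapped content entities get the int key shown in the node assertions.
    $list = $entityClass === null ? "array<$class>" : "array<int, $class>";

    switch ($method) {
        case 'create':
            return $class;
        case 'load':
        case 'loadUnchanged':
            return $class . '|null';
        case 'loadMultiple':
        case 'loadByProperties':
            return $list;
        default:
            return null;
    }
}
```

Calling it with 'load' and the Node class gives the nullable Node type, while passing null for the class reproduces the docblock defaults.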

What would you like to see?

If you have any questions about this release, see the GitHub discussion for the release: https://github.com/mglaman/phpstan-drupal/discussions/253