
Hosted by Derick Rethans · EN

PHP Internals News: Episode 103: Disjunctive Normal Form (DNF) Types Friday, June 24th 2022, 09:07 BST London, UK In this episode of "PHP Internals News" I talk with George Peter Banyard (Website, Twitter, GitHub, GitLab) about the "Disjunctive Normal Form Types" RFC that he has proposed with Larry Garfield. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:15 Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 103. Today I'm talking with George Peter Banyard again, this time about a disjunctive normal form types RFC, or DNF, for short, which he's proposing together with Larry Garfield. George Peter, would you please introduce yourself? George Peter Banyard 0:39 Hello, my name is George Peter Banyard, I work on PHP paid part time, by the PHP foundation. Derick Rethans 0:44 Just like last time, we are still got colleagues. George Peter Banyard 0:46 Yes, we are indeed still call it. Derick Rethans 0:48 What is this RFC about? What is it trying to solve? George Peter Banyard 0:52 The problems of this RFC is to be able to mix intersection and union types together. Last year, when intersection types were added to PHP, they were explicitly disallowed to be used with Union types. Because: a) mental framework, b) implementation complexity, because intersection types were already complicated on their own, to try to get them to work with Union types was kind of a big step. So it was done in chunks. And this is the second part of the chunk, being able to use it with Union types in a specific way. Derick Rethans 1:25 What is the specific way? George Peter Banyard 1:27 The specific way is where the disjoint normal form thing comes into play. So the joint normal form just means it's a normalized form of the type, where it's unions of intersections. The reason for that it helps the engine be able to like handle all of the various parts it needs to do, because at one point, it would need to normalize the type anyway. And we currently is just forced on to the developer because it makes the implementation easier. And probably also the source code, it's easier to read. Derick Rethans 1:54 When you say, forcing it up on a developer to check out you basically mean that PHP won't try to normalize any types, but instead throws a compilation error? George Peter Banyard 2:05 Exactly. It's, it's the job of the developer to do the normalization step. The normalization step is pretty easy, because I don't expect people to do too many stuff as intersection types. But as can always be done as a future scope of like adding a normalization step, then you get into the issues of like, maybe not having deterministic code, because normalization steps can take very, very long, and you can't necessarily prove that it will terminate, which is not a great situation to be in. Imagine just having PHP not running at all, because it's stuck in an infinite loop trying to normalize the format. It's just like, oh, I can't compile Derick Rethans 2:39 Would a potential type alias kind of syntax help with that? George Peter Banyard 2:44 Maybe, I'm not really sure. Actually reading like research about it from computer scientists, in functional programming languages, which is everything is compiled on my head. And they have the whole thing was like, well, they need to type type normalize, and especially with type aliases, they haven't really figured out a way yet. So I'm not sure how we are going to figure out a way if experts and PhD students and researchers haven't really figured out a way. Derick Rethans 3:08 And is the reason for that mostly, because PHP, resolves types while it is running code sometimes because it has to overload classes, and then it might find out it is an inherited class, for example? George Peter Banyard 3:19 Yes, I think it's like this weird thing where might maybe PHP has like kind of an advantage, because it doesn't need to, like resolve all of the types at once. And if you have a type alias, it's just oh, if it's used, and you just need to resolve it, and then try to figure it out. There's also the added complexity of like, variance checks, because most functional programming languages, they have variance to some degree, but they don't have the whole inheritance of like typical OOP languages have. It's kind of a very strange field, the fact that yeah, PHP is just like, well, we kind of do stuff at runtime, and you don't necessarily need everything. And it just works is like, well, we'll do. That's mainly the reason why the dev needs to do the normalization step, the form is done. It's also I think, the most easiest to understand, it's just like, Oh, you have this and this, or this group, or stuff, or this group of stuff, or this thing, simple type. The other form would be another normalized form would be conjunctive normal form, which is a list of ANDs of ORs to just have this thing, or X, like (A or B or C) and X and (Y or Z), which I think is harder to understand. Derick Rethans 4:26 What is the exact syntax then? George Peter Banyard 4:28 So the exact syntax is, if you want to have an intersection type was in a union type, you need to like bracket it by parentheses. And then you have like the normal pipe union operator and you can mix it with single types, you can mix it with true, you can mix it with false, which are literal types, which now exist, or just normal, bool types. Derick Rethans 4:48 The parenthesis is actually required. You don't rely on operator precedence to make things work? George Peter Banyard 4:53 Yes. Relying on operator precedence is terrible. Derick Rethans 4:57 Yep, I agree. George Peter Banyard 4:58 I'd say Oh, yeah, but I think I've heard this argument on the list like a couple of times, it's just, oh, yeah, but maths, like, has like, and as priority over like, or, I mean, I did three years of a maths degree and not gonna lie. Maths notation is terrible for most of us. People don't even agree on terminology. I'm just gonna say, let's, let's just do better. Derick Rethans 5:19 I agree. I mean, most coding standards for any sort of variable for like conditions, will already require parenthesis around multiple complex clauses anyway, right? I mean, it's a sensible thing to do, just for readability, in my opinion. So the RFC also talks about a few syntax that you aren't allowed to do, and that you have to normalize or deconstruct yourself, what kinds of things are these? George Peter Banyard 5:41 if you would want to have a type which has an intersection of a class A with at least one other class, so let's say X or Y, but you can always convert it into DNF form, how this type would be, it would be (A and X) or (A and Y). This seems to be the more unusual case, I would imagine. One of the motivating cases of DNF types is to do something like Array or (Traversable and Countable). I don't really see mixing and match...

PHP Internals News: Episode 102: Add True Type Thursday, June 2nd 2022, 09:06 BST London, UK In this episode of "PHP Internals News" I talk with George Peter Banyard (Website, Twitter, GitHub, GitLab) about the "Add True Type" RFC that he has proposed. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:00 Hi I'm Derick. Welcome to PHP internals news, the podcast dedicated to explaining the latest developments in the PHP language. This is episode 102. Today I'm talking with George Peter Banyard about the Add True Type RFC that he's proposing. Hello George Peter, would you please introduce yourself? George Peter Banyard 0:33 Hello, my name is George Peter Banyard, I work part time for the PHP Foundation. And I work on the documentation. Derick Rethans 0:40 Very well. We're co workers really aren't we? George Peter Banyard 0:43 Yes, indeed, we all co workers. Derick Rethans 0:45 Excellent. We spoke in the past about related RFCs. I remember, which one was that again? George Peter Banyard 0:51 Making null and false stand alone types Derick Rethans 0:53 That's the one I was thinking of him. But what is this RFC about? George Peter Banyard 0:56 So this RFC is about adding true as a single type. So we have false, which is one part of the Boolean type, but we don't have true. Now the reasons for that are a bit like historical in some sense, although it's only from PHP 8.0. So talking about something historical. When it's only a year ago, it's a bit weird. The main reason was that like PHP has many internal functions, which return false on failure. So that was a reason to include it in the Union types RFC, so that we could probably document these types because I know it would be like, string and Boolean when it could only return false and never true. So which is a bit pointless and misleading, so that was the point of adding false. And this statement didn't apply to true for the most part. With PHP 8, we did a lot of warning to value error promotions, or type error promotions, and a lot of cases where a lot of functions which used to return false, stopped returning false, and they would throw an exception instead. These functions now always return true, but we can't type them as true because we don't have it, and have so they are typed as bool, which is kind of also misleading in the same sense, with the union type is like, well, it only returns false. So no point using the boolean, but these functions always return true. But if you look at the type signature, you can see like, well, I need to cater to the case where the returns true and when returns false. Derick Rethans 2:19 Do they return true or throw an exception? George Peter Banyard 2:22 Yeah, so they either return true, or they either throw an exception. If you would design these functions from scratch, you would make them void, but legacy... and we did, I know it was like PHP 8.0, we did change a couple of functions from true to void. But then you get into these weird shenanigans where like, if you use the return value of the function in a in an if statement, null gets because in PHP, any function does return a value, even a void function, which returns null. Null gets coerced to false. So you now get like, basically a BC break, which you can't really? Yeah, we did a bit and then probably we sort of, it's probably a bad idea. That's also the point of like, making choices, things that are static analysers can be like, more informants being like, Okay, your if statement is kind of pointless here. Derick Rethans 3:06 Yeah, you don't want to end up breaking BC. Now, we already had false and bool, you're adding true to this. How does that work with Union types? Can you make a union type of true or false? George Peter Banyard 3:18 No. So there are two reasons mainly. A. true and false is the same as like boolean, which is like just use Boolean in this case. But you can say, well, it's more specific, so just allow it. So that's would be reasonable. But the problem is, false has different semantics than boolean. False does not coerce values. So it only accepts false as a literal value. Whereas boolean, if you're not in strict type, which is a lot of code, it will cause values like zero to false one, or any other integers to true. It will coerce every other integer to true, like the true type follows the behaviour of false of being a value type. So it only accepts true, you would get into this weird distinction of does true or false, mean exactly true or false? Or do you get the same behaviour as using the boolean type? Derick Rethans 4:07 So I would say that true or false would than be more restrictive than bool. George Peter Banyard 4:12 Exactly, which is a bit of a problem, because PHP internally has true and false and separate types, which also makes the implementation of this RFC extremely easy, because PHP already makes the distinction of them. But at the same time, the boolean type is just a union of the bitmask of true and false. You can't really distinguish between the types, true or false, or the boolean type within the type system. Currently just does it by checking if it only has one then it can do like two checks. Specifically, you would need to add like an extra flag. I mean, it's doable, but it's just like, Well, who knows which semantics we want? Therefore, just leave it for future discussion because I'm not very keen on it to be fair. Derick Rethans 4:55 True or false are really only useful for return values and not so much for arguments types, because if you have an argument that that always must be true, then it's kind of pointless to have of course. George Peter Banyard 5:05 Same as like it was with the null type RFC. Although there might be one case where PHP internal functions might change the value to true for an argument, I can maybe two types, would be like with the define function, this thing being like case insensitive or case sensitive, I don't remember what the parameter actually; could actually either be false or true, because at the moment, I think emits a notice, things do like the this thing is not supported, therefore the values what was ignored. But we could conceivably see that in PHP 9, we would actually implement this as a proper like: Okay, this only accepts true, yes, this argument is pointless, but it's in the middle of the function signature, so you can't really move it. The spl_register_overload function has like as its second argument, the throw on error or not, which since PHP 8 only accepts true, but it's in the middle of the function. The last argument is still very useful. It's prepend, instead of append the autoloader, I think, or might be the other way around, check the docs. Since PHP 8, this only accepts true. So if you pass in false, it will emit a notice and saying you'd like this argument has just been ignored. So whatever. But we can't really remove the argument. Because well, it's, if you use the third argument, as with positional arguments, then you would change like the signature and you would break it. Now, we don't have a way to enforce in PHP to use named arguments, because that would be a solution. It's just like, well, if you want to set this argument, you need to use named arguments, but we can't do that. Otherwise, then creating a new function, which has an alias, which is also kind of terrible. That would be one of the maybe only cases where you would actually get like true as a as an argument ...

PHP Internals News: Episode 101: More Partially Supported Callable Deprecations Thursday, May 19th 2022, 09:05 BST London, UK In this episode of "PHP Internals News" I talk with Juliette Reinders Folmer (Website, Twitter, GitHub) about the "More Partially Supported Callable Deprecations" RFC that she has proposed. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick. Welcome to PHP internals news, the podcast dedicated to explaining the latest developments in the PHP language. This is episode 101. Today I'm talking with Juliette Reinders Folmer, about the Expand Deprecation Notice Scope for Partially supported Callables RFC that she's proposing. That's quite a mouthful. I think you should shorten the title. Juliette, would you please introduce yourself? Juliette Reinders Folmer 0:37 You're starting with the hardest questions, because introducing myself is something I never know how to do. So let's just say I'm a PHP developer and I work in open source, nearly all the time. Derick Rethans 0:50 Mostly related to WordPress as far as I understand? Juliette Reinders Folmer 0:52 Nope, mostly related to actually CLI tools. Things like PHP Unit polyfills. Things like PHP Code Sniffer, PHP parallel Lint. I spend the majority of my time on CLI tools, and only a small portion of my time consulting on the things for WordPress, like keeping that cross version compatible. Derick Rethans 1:12 All right, very well. I actually did not know that. So I learned something new already. Juliette Reinders Folmer 1:16 Yeah, but it's nice. You give me the chance now to correct that image. Because I notice a lot of people see me in within the PHP world as the voice of WordPress and vice versa, by the way in WordPress world to see me as far as PHP. And in reality, I do completely different things. There is a perception bias there somewhere and which has slipped in. Derick Rethans 1:38 It's good to clear that up then. Juliette Reinders Folmer 1:39 Yeah, thank you. Derick Rethans 1:40 Let's have a chat about the RFC itself then. What is the problem that is RFC is trying to solve? Juliette Reinders Folmer 1:46 There was an RFC or 8.2 which has already been approved in October, which deprecates partially supported callables. Now for those people listening who do not know enough about that RFC, partially supported callables are callables which you can call via a function like call_user_func that which you can't assign to variable and then call as a variable. Sometimes you can call them just by using the syntax which you used for defining the callable, so not as variable but as the actual literal. Derick Rethans 2:20 And as an example here, that is, for example, static colon colon function name, for example. Juliette Reinders Folmer 2:26 Absolutely, yeah. Derick Rethans 2:27 Which you can use with call_user_func by having two array elements. You can call it with literal syntax, but you can't assign it to a variable and then call it. Do I get that, right? Juliette Reinders Folmer 2:36 Absolutely. That's it. There's eight of those. And basically, the original RFC from Nikita proposed to deprecate support for them in 8.2, add deprecation notices and remove support for them altogether in PHP nine. And the original RFC explicitly excluded two particular things from those deprecation notices. That's the callable type and using the syntaxes in combination with the is_callable function, where you're checking if the syntax is callable. The argument used in the original RFC was to keep those side effect free. The problem with this is that with the callable type, this means you go from absolutely no notice or nothing, to a fatal error in PHP 9. Everything works, and you're not getting any notification. But in PHP 9, its fatal error at the moment that callable is being passed to a function. Derick Rethans 3:31 This is the callable type in function declarations. Juliette Reinders Folmer 3:33 Yeah, absolutely. And with is_callable, I discovered a pattern in my wanderings across the world where people use the syntax in is_callable, but then use it in a literal call. So not using call_user_func, not using a variable to call it, but it's callable static double colon method name, and then called static double colon method name as literal. And that pattern basically, for valid calls would mean that that function would no longer be called in PHP 9 without any notification whatsoever. Derick Rethans 4:13 So it's a silent change that you can't detect at all. Juliette Reinders Folmer 4:17 Yeah, which to me sounded dangerous. I started asking some questions about that. But six weeks ago, the conclusion was, well, maybe this should be changed. But as this was explicit in the original RFC, we can't just change it. We need to have a new RFC to basically amend the original RFC and remove the exception for these two situations and allow them to throw deprecation notices. Derick Rethans 4:44 What are you proposing to change with this RFC than? Juliette Reinders Folmer 4:47 What this RFC is proposing is simply to remove the exception that the callable type and is_callable are not throwing a deprecation notice. This RFC is proposing that they should throw a deprecation notice, so that more of these type situations can be discovered in time for PHP 9 to prevent users getting fatal errors. Derick Rethans 5:08 Now, of course, we have no idea when PHP nine is actually showing up, but I don't think it will be this year. Well, I know it won't be this year, and it certainly won't be be next year neither, I think. Juliette Reinders Folmer 5:17 That's all the same. I mean, it makes there'll be two, three years ahead, but it doesn't really make sense to have the main deprecation in 8.2 and then have the additional deprecation in 8.4 or something. Derick Rethans 5:29 Absolutely. Juliette Reinders Folmer 5:30 It's a lot more logical to have it all in in the same version. Because it's all related. It's basically the same thing without the exception for callable type. And is_callable. Derick Rethans 5:42 Although there is no current application, would this be able to be found if you had like a comprehensive test suite? Juliette Reinders Folmer 5:48 Yes and no. Yes, you can find this with a test suite. But one, you're presuming that there are tests. Two, that the tests covered the effected code with enough path coverage. Three, imagine a test you've written yourself at some point in the past where which affected callables, you might have, you know, a data provider where you say: Okay, valid callable function, which you've mocked or, you know, closure, wh...

PHP Internals News: Episode 100: Sealed Classes Thursday, March 24th 2022, 09:04 GMT London, UK In this episode of "PHP Internals News" I talk with Saif Eddin Gmati (Website, Twitter, GitHub) about the "Sealed Classes" RFC that he has proposed. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick. Welcome to PHP internals news, the podcast dedicated to explaining the latest developments in the PHP language. This is episode 100. Today I'm talking with Saif Eddin Gmati about the sealed classes RFC that they're proposing. Saif, would you please introduce yourself? Saif Eddin Gmati 0:31 Hello, my name is Saif Eddin Gmati. I work as a Senior programmer at Les-Tilleuls.coop. I'm an open source enthusiast and contributor. Derick Rethans 0:39 Let's dive straight into this RFC. What is the problem that you're trying to solve with it? Saif Eddin Gmati 0:43 Sealed classes just like enums and tagged unions allow developers to define their data models in a way where invalid state becomes less likely. It also eliminates the need to handle unknown subtypes for a specific model, as using sealed classes to define models gives us an idea on what child types would be available at run time. Sealing also provides us with a way for restricting inheritance or the use of a specific trait. For example, if we look at logger trait from the PSR log package that could be sealed to logger interface. This way, we ensure that every use of this trait is coming from a logger not from any other class. Derick Rethans 1:24 I'm just reading through this RFC tomorrow, again, and something I didn't pick up on reading to it last time. It states that PHP already has sort of two sealed classes. Unknown Speaker 1:35 Yes, the throwable class in PHP can only be implemented by extending either error or exception. The same applies for DateTime interface, which can only be implemented by extending DateTime class or DateTime Immutable class. Derick Rethans 1:52 Because PHP itself doesn't allow you to implement either throwable or DateTimeInterface. I haven't quite realized that that these are also sealed classes really. What is sort of the motivation behind wanting to introduce sealed classes? Unknown Speaker 2:06 The main motivation for this feature comes from Hack the programming language. Hack contains a lot of interesting type concepts that I think personally, PHP could benefit from and sealed classes is one of those concepts. Derick Rethans 2:18 What kind of syntax are you proposing? Saif Eddin Gmati 2:21 The syntax I'm proposing actually there is three syntax options for the RFC currently, but the main syntax is inspired by both Hack and Java. It's more similar to the syntax used in Java as Hack uses attributes. Personally, I have been I guess, using attributes from the start as I personally see sealing and finalizing similar as both effects how inheritance work for a specific class. Having sealed implemented as an attribute while final uses a keyword brings more inconsistency into the language which is why I have decided not to include attributes as a syntax option. Derick Rethans 2:56 In my opinion, attributes shouldn't be used for any kind of syntax things. What they should be used for is attaching information to already existing things. And by using attributes again, to extend syntax, you sort of putting this syntax parsing in two different places , right? You're putting it both in the syntax as well as in attributes. I asked what the syntax is, but I don't think he actually mentioned what the syntax is. Saif Eddin Gmati 3:20 The syntax the main set next proposed for the RFC is using sealed and permit as keywords we first have the sealed modifier which is added in front of the class similar to how final or abstract modifiers are used. We also have the permit clause which is basically a list allows you to name a specific classes that are able to inherit from this specific type. Derick Rethans 3:43 So when you say type here, is that just interfaces and classes or something else as well? Saif Eddin Gmati 3:48 It's classes interfaces and traits. Traits are allowed to add sealing but they are not allowed to permit. Okay for example, an interface is not allowed to permit a trait because a trait cannot implement an interface Derick Rethans 4:03 In the language itself, when does this get enforced? Saif Eddin Gmati 4:06 This inheritance restriction gets enforced when loading a class. So let's say we are loading Class A currently if this class extends B, we check if B is sealed. And if it is we check if B allows A to extend it. But when loading a specific sealed class, nothing gets actually checked. We just take the permit clause classes and store them and move on. Derick Rethans 4:32 It only gets checks if you're trying to implement an interface. Saif Eddin Gmati 4:36 This gets enforced when trying to implement an interface, extend that class, or use it trait. Derick Rethans 4:41 Okay. What are general use cases for this feature? Saif Eddin Gmati 4:45 General use cases for a feature are for example, implementing programming concepts such as Option which is a type that can only have two subtypes. One is Some, other is None. Another concept is the Result where only two subtypes are possible, either success or failure. Another use case is to restrict inheritance. As I mentioned before, for example, logger trait from the PSR log package is a trait that implements some of the method methods in logger interface, and expects whoever is using that trait to implement the rest. However, there is no restriction by the language regarding this, we can seal this trait to a logger interface ensuring that only loggers are allowed use this trait. Derick Rethans 5:34 When you say that Option has like the value Some or None, just sound like an enum to me. How should I think differently about enums and sealed classes here? Saif Eddin Gmati 5:43 Enums cannot hold a dynamic value. You can have a value but you cannot have a dynamic value, however, tagged unions will allow you to implement option the same way. Tagged unions are that useful only for this specific case, there is some other cases such as the one I mentioned for traits that cannot actually be implemented using the tagged unions. There is also the I don't know how to say this. Let's say we have a type A that sealed and permitting only B and C. And this case A on itself, as long as it's not an abstract class, is by itself a type. Can be used as a normal class, you can create an instance and use it normally. However with tagged unions, the option itself would not be a type, you either have some or none. That's the main difference between tagged unions until classes Derick Rethans 6:37 A tagged union PHP doesn't have them. So how does a tagged union relate to enums? Saif Eddin Gmati 6:43 ...

PHP Internals News: Episode 99: Allow Null and False as Standalone Types Thursday, March 10th 2022, 09:04 GMT London, UK In this episode of "PHP Internals News" I talk with George Peter Banyard (Website, Twitter, GitHub, GitLab) about the "Allow Null and False as Standalone Types" RFC that he has proposed. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:15 Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explain the latest developments in the PHP language. This is episode 99. Today I'm talking with George Peter Banyard, about the Allow null and false at standalone types RFC that he's proposing. Hello, George Peter, would you please introduce yourself? George Peter Banyard 0:36 Hello, my name is George Peter Banyard. I work on the PHP language, and I'm an Imperial student in maths in my free time. Derick Rethans 0:44 Are you're trying to say you're a student in your free time or contribute to PHP in your free time? George Peter Banyard 0:49 I feel like at this time, it's like, both are true at the same time. Derick Rethans 0:53 Let's hop into this RFC. It is titled allow null and false as standalone types. What is the problem that it is trying to solve? George Peter Banyard 1:02 This is the second iteration of this RFC. So the first one was to just allow null initially, and null is the unit type In type theory parlance of PHP, ie the type which only has one value. So null is a type and a value. And the main issue is that when for leads more with like inhabitants, and like the Liskov substitution principle. If you have like a method, like the parent method, which can be told like either null or an object, and your implementation in a child class always returns null, for various reasons, maybe because it doesn't support this feature, or whatever is out, or... If your child method only returns null, currently, you can't document, that you can't type this properly, you can document it in a doc comment or something like that. But due to how PHP type handling works, you need to specify at least like another type with null in the union. Basically resort to always saying like mimicking the parent signature, when you could be more specific. This was the main use case I initially went into. Derick Rethans 2:08 If I understand correctly, you can't just have an inherited method that has hinted as to just return null? George Peter Banyard 2:14 Exactly. If you always return null, maybe because you always work or something like that, then you must still declare the return type as like null or exception, which is not a concrete because you say what, like why never fail. And like static analysers, if they can figure it out that you're using a child class, they can't maybe like do some assumptions or work further down that like what you're doing is redundant or things like that. So that's one of the main reasons I initially went with it. And I didn't add false initially, because it was like, well, false, it's not really a type properly. It's, it's what's called a value type. False is one value from the Boolean type. And I was like, Well, okay, we're just going to limit it to like, being the type theory purist, limited to proper types, where null is a proper type, although it's a bit sometimes misunderstood, I feel in the PHP community at large. And then people were like, well, if we add null, then by the only type-ish thing, which you can use in a type declaration, or whatever, which can't be used in a return type on its own, is false. And it's just weird. So why not add it in full. So that was the second thing as to why I added it. Some of PHP internal's functions being terribly designed because they were designed back in the early noughties, return null on success and false on failure, which you can't probably type at the moment. Currently, we need to type them as like Boolean or null, but true can never be returned in this case. And there are some other some other people have reached out to me it's like, well, yeah, but I always return false in this case. Or I also return always true in this case, although true, we have this weird asymmetry that we have false as a value type and not true. Derick Rethans 3:49 What was the reason for having false but not true? George Peter Banyard 3:53 When the union type RFC got discussed and passed for PHP 8.0, false was added, because a lot of traditional behaviour of PHP internal functions, was to return false on failure, instead of the technically more correct thing would be to return null. Because loads of functions return a false on failure, and saying that like in returns, these types, or a Boolean would be basically lying because you could never have true, false was included in it. With the restrictions that you can only use false as the complement with other types. So you need to do for example, array, or false, you couldn't just use false. Derick Rethans 4:37 Would it also mean that you can define a return type of a method that inherited a method that returns a bool, as false? George Peter Banyard 4:48 Yes, that would be now possible with the amended proposal. Yeah, which goes back to this weird a symmetry, we're probably. Adding true to make a complete would be a future RFC to do. Derick Rethans 5:00 Now, we've talked about return types. But I guess the reverse applies to arguments? George Peter Banyard 5:06 Arguments and property types also would, would be allowed to, like declare themselves as like null or false. The usefulness here is way more limited. Because if you declare an argument to be of type null, then basically you can only ever pass a null to it. And then therefore, the type doesn't do anything. Derick Rethans 5:26 But in an inherited method, you could then widen it. George Peter Banyard 5:31 Yes, exactly. You could always say: Well, this argument exists, it's always null. If you extend like your class or message, then you can add other types. But in theory, you can already do that by adding like an argument at the end of the message, because that's LSP compliant. The case for, and properties of those, because they are typing, they're in like their beads. Kind of debatable why you would do that. But it's just that like, well, if you accept types at one point, just restricting them like somewhere else gets very weird. At this point is more like look at the human review, or like use static analysis for the analyser to tell you like this argument is redundant and just remove it or this property doesn't make any sense. Because if it can only ever be null, why does it even exist in the first place? Derick Rethans 6:13 Right now, you can already use false in union types, but why not with null or false? George Peter Banyard 6:19 That goes back to the when a union type RFC got introduced. Null got added as a keyword. Before you could only use the question mark, before a type to make the type nullable. If you have a more complex union type, to not use the question mark in front of it. Therefore, the null keyword got added as a proper type. And because the logic was, Well, you shouldn't ever be able to return just null. Because then that function is kind of equivalent to void. Because of that, it was said that like, Well, okay, null and false basically have like kind of the same s...

PHP Internals News: Episode 98: Deprecating utf8_encode and utf8_decode Thursday, March 3rd 2022, 09:02 GMT London, UK In this episode of "PHP Internals News" I chat with Rowan Tommins (GitHub, Website, Twitter) about the "Deprecate and Remove utf8_encode and utf8_decode" RFC. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick. Welcome to PHP Internals News, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 98. Today I'm talking with Rowan Tommins about the "Deprecate and remove UTF8_encode and UTF8_decode" RFC that he's proposing. Hi, Rowan, would you please introduce yourself? Rowan Tommins 0:38 Hi, I'm Rowan Tommins. I'm a PHP software architect by day and try and contribute back to the community and have been hanging around in the internals mailing list for about 10 years and contributed to make the language better, where I can. Derick Rethans 0:57 Excellent. Yeah, that's how I started out as well, many, many more years before that, to be honest. This RFC, what problem is this trying to solve? Rowan Tommins 1:08 PHP has these two functions, utf8_encode and utf8_decode, which, in themselves, they're not broken. They do what they are designed to do. But they are very frequently misunderstood. Mostly because of their name. And because Character Encodings in general, are not very well understood. People use them wrong, and end up getting in all sorts of pickles that are worse than if the functions weren't there in first place. Derick Rethans 1:37 What are you proposing with the RFC then? Rowan Tommins 1:39 Fundamentally, I'm proposing to remove the functions. As of PHP 8.2, there will be a deprecation notice whenever you use them, and then in 9.0, they would be gone forever, and you wouldn't be able to use them by mistake, because they just wouldn't be there. Derick Rethans 1:56 I reckon there's going to be a way to actually do what people originally intended to do with it at some point, right? Rowan Tommins 2:02 So yeah, there are alternatives to these functions, which are much clearer in what you're doing, and much more flexible in what you can do with them so that they cover the cases that these functions sound like they're going to do, but don't actually do when you understand what they're really doing. Derick Rethans 2:20 I think we'll get back to that a little bit later on. You're wanting to deprecate these functions. But what do these functions actually do? Rowan Tommins 2:27 What they actually do is convert between a character encoding called Latin-1, ISO 8859-1, and UTF-8. So utf8_encode converts from Latin-1 into UTF-8, utf8_decode does the opposite. And that's all they do. Their names make it sound like they're some kind of fix all the UTF 8 things in my text. But they are actually just these one very specific conversion, which is occasionally useful, but not clear from their names. Derick Rethans 3:01 It's certainly how I have seen it used in the past, where people just throw everything and the kitchen sink at it, and expecting it to be valid UTF 8, and then at the end, decode. I mean, the decoding was not even part much of this, right? It's just throw everything at it, and then magically it will all be UTF 8. But I reckon that's not really quite the case. When and how does that go wrong? Rowan Tommins 3:26 So what actually ends up happening is, because text doesn't know what encoding it's in. Something that people misunderstand about character encoding is they think it's like, the text is a certain colour, and the computer knows what colour it is. And if you tell the computer to make it a different colour, then it will work. But it's not like that. In the computer, there's just the sequence of binary. And the encoding is how to read that binary as text. And if you tell the computer to read it as Latin 1, it will read it as Latin 1. If you take to convert from Latin 1 to UTF 8, it will assume the input is Latin 1, it will convert to UTF 8 on that basis. If your text actually wasn't Latin 1 in the first place, you're just going to end up with garbage. And some of the worst cases of that is when you already have UTF 8, and then you run utf8_encode on it, because the language doesn't know that you've already got UTF 8, so it tries to read its Latin 1, write it out ass UTF 8 and you get this weird Mojibake. I don't know pronouncing that right. Derick Rethans 4:27 I think it's pronounced Mojibake. Rowan Tommins 4:30 Mojibake. Derick Rethans 4:31 It's a Japanese term, because clearly these things, these issues happened with Japanese text quite a lot because they have a lot more different and difficult characters and encodings as well. With which things often go wrong though? Rowan Tommins 4:44 Using an unco on text that's already UTF 8 is obviously a big one. Usually obvious, but occasionally people just getting a muddle with that. The other thing that often happens is confusing with similar encoding. Latin 1 is often mistaken for a different coding windows 1252. To the extent that web pages labelled as Latin 1, web browsers will assume that they're actually in Windows 1252. These PHP functions don't make that assumption. If your text is actually in Windows 1252, and it's been mislabelled Latin 1, you might still think you're doing the right thing. So I've got Latin 1 text, but you haven't. And then the characters that are different, are going to get mangled again. And there's a few other related encodings that often look the same. There are a few other encodings that look the same at a glance that again, will go wrong on any character that's different between the different encodings. Derick Rethans 5:43 How could a function tell which encoding a certain text was in? Rowan Tommins 5:49 It's tricky. There are libraries out there that try to do it. Some encodings that are sequences of bits that aren't a valid character. So if any of those appear, it's definitely not in that encoding. Unfortunately, a lot of encodings, every pattern of bits has a meaning. It's just not necessarily mean. So you can't look at the string and just tell at a glance. The only way I've seen that does it effectively, is trying to guess based on what language text it might be in. If your text suddenly has a load of symbols in the middle of sentences, you're probably using the wrong encoding. If it's suddenly got a load of capital letters, in the middle of words, you're probably using the wrong encoding. So you can make guesses like that, that ultimately, there are only ever guesses. Derick Rethans 6:38 It's only always going to be a guess, right? You can't really tell for certain what it it is, which I've seen people assume that she can just tell. We have concluded that utf8_encode and decode don't actually do what they say they don't magically encode everything to UTF 8. What if things go wrong? How are errors handled? Rowan Tommins 6:58 If you're converting from Latin 1 into UTF 8, there Latin 1 covers all 256 possible eight bit binary strings. Those will correspond directly to a single mapping in Unicode and therefore in UTF 8. So there are no errors as such, when...

PHP Internals News: Episode 97: Redacting Parameters Thursday, January 27th 2022, 09:09 GMT London, UK In this episode of "PHP Internals News" I chat with Tim Düsterhus (GitHub) about the "Redacting Parameters in Back Traces" RFC. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:00 Before we start with this episode, I want to apologize for the bad audio quality. Instead of using my nice mic I managed to use to one built into my computer. I hope you'll still enjoy the episode. Derick Rethans 0:30 Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 97. Today I'm talking with Tim Düsterhus about Redacting Parameters in Backtraces RFC that he's proposing. Tim, would you please introduce yourself? Tim Düsterhus 0:50 Hi, Derick, thank you for inviting me. I am Tim Düsterhus, and I'm a developer at WoltLab. We are building a web application suite for you to build online communities. Derick Rethans 0:59 Thanks for coming on this morning. What is the problem that you're trying to solve with this RFC? Tim Düsterhus 1:05 If everything is going well, we don't need this RFC. But errors can and will happen and our application might encounter some exceptional situation, maybe some request to an external service fails. And so the application throws an error, this exception will bubble up a stack trace and either be caught, or go into a global exception handler. And then basically, in both cases, the exception will be logged into the error log. If it can be handled, we want to make the admin side aware of the issues so they can maybe fix their networking. If it is unable to be handled because of a programming error, we need to log it as well to fix the bug. In our case, we have the exception in the error log. And what happens next? In our case, we have many, many lay person administrators that run a community for their hobby, they're not really programmers with no technical expertise. And we also have a strong customers help customers environment. What do those customers do? They grab their error log and post it within our forums in public. Now in our forum, we have the error log with the full stack trace, including all sensitive values, maybe user passwords, if the Authentication Service failed, or something else, that should not really happen. In our case, it's lay person administrators. But I'm also seeing that experienced developers can make this mistake. I am triaging issues with an open source software written in C. And I've sometimes seeing system administrators posting their full core dump, including their TLS certificates there, and they don't really realize what they have just done. That's really an issue that affects laypersons, and professional administrators the same. In our case, our application attempts to strip those sensitive information from this backtrace. We have a custom exception handler that scans the full stack face, tries to match up class names and method names e.g. the PDO constructor to scrub the database password. And now recently, we have extended this stripping to also strip anything from parameters that are called password, secret, or something like that. That mostly works well. But in any case, this exception handler will miss sensitive information because it needs to basically guess what parameters are sensitive values and which don't. And also our exception handler grew very complex because to match up those parameters, it needs to use reflection. And any failures within the exception handler cannot really be recovered from, if the exception handler fails, you're out of luck. Derick Rethans 3:51 Quite a few things to think of to make sure that you're not sharing any secrets. And I certainly have seen almost doing this myself. We now know what the problem is. How is this RFC proposing to fix this? Tim Düsterhus 4:03 Primarily, we want to propose a standardized way for applications or libraries to indicate which parameters hold sensitive values. Our custom exception handler uses reflection as we said before, and it only matches up the parameter's names, but we also have this attribute I am proposing, SensitiveParameter within our application itself. Any parameter names that are not definitely sensitive can be attributed with this attribute. But this only works within our software, but not with any third party libraries we are using, e.g. for encryption or whatever there is. Primarily we want to propose a standardized way an attribute that is in PHP core, anyone can use that and everyone knows what this attribute means. Secondarily, the RFC is proposing a default implementation to keep the exception handler simple. As I said before, we are using reflection. This is very complex, it does not work with the require_once or include_once family, because that are not functions. We need to handle this case to not try to attempt to reflect on those non functions when redacting any parameters. This is complex. And we want to simplify that. Derick Rethans 5:20 From what I understand this is then a way to make sure that there's a standardized method for marking arguments as being sensitive. And because this is that now standardized, only one solution to the problem has to be found right? Tim Düsterhus 5:34 Basically, not every library is using their own attributes, possibly, or we can match parameter names that are not like password, secret, but it can be documented: hey, if you are using sensitive parameters, you should put this attribute and then those exception handlers will be aware that this attribute is sensitive and can strip it, or in case of the RFC PHP itself, will already strip those parameters from the stack trace. Derick Rethans 6:04 You're suggesting that PHP standard way of showing stack traces also takes care of the sensitive parameter here? Tim Düsterhus 6:11 Yes, exactly. Derick Rethans 6:13 Which internal PHP functions are likely to get this attribute? Tim Düsterhus 6:16 Basically anything with a parameter called password or secret, as I said before, examples include PDO's constructor, the database password will be in there and possibly also the user name or host name, which might be considered sensitive. But the password is the most important thing I have on my list. ldap_bind, which possibly includes user passwords; the password_hash function; possibly various OpenSSL functions. One will need to look and this list can be extended in the future as well, if someone realizes we missed anything. Derick Rethans 6:55 Now, I know sometimes that there's a problem where an application connects to the wrong server with PDO. And as you say, the host name was also in this PDO constructor, would it not then make debugging that specific case harder because the hostname would also be redacted from the stack traces? Tim Düsterhus 7:14 The attribute I am proposing as the parameter attribute, each parameter can be sensitive or non sensitive. We would need to decide whether we consider the hostname sensitive or not. It usually is not. So I would not put the attribute on the host name, or on the DSN string in the first parameter. The password definitely is sensitive. And the username possibly is a grey area. By default, I probably would not put the attribute there. But this is something that needs to be discussed in the greater community possibly. Derick Rethans 7:47 I saw in the RFC that when you request a stack trace in PHP with get back trace or whatever the name of this function is, is...

PHP Internals News: Episode 96: User Defined Operator Overloads Thursday, December 16th 2021, 09:24 GMT London, UK In this episode of "PHP Internals News" I chat with Jordan LeDoux (GitHub) about the "User Defined Operator Overloads" RFC. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 96. Today I'm talking with Jordan, about a user defined operator overloads RFC that he's proposing. Jordan, would you please introduce yourself? Jordan LeDoux 0:33 My name is Jordan LeDoux. I've been working in PHP for quite a while now. This is the second time I have ventured to propose an RFC. Derick Rethans 0:44 What was the first one? Jordan LeDoux 0:45 The first one was the "never for parameter types", which was much more exploratory. And we talked about it a little bit. And it generated a lot of good discussion that contributed to kind of the idea formation, which was what I hope to get out of it. Derick Rethans 1:01 Okay, but that didn't end up making it into a PHP release. As far as I understand, right? Jordan LeDoux 1:07 No, I withdrew it actually, it was clear that the better way to approach the problem it was trying to solve was with a much more comprehensive solution. That particular solution was something that only required a seven line change to the engine. So I wanted to see if it was something people were okay with, or thought was a decent idea for that particular problem, much more comprehensive, like template classes, or something like that is probably the better route to go. Derick Rethans 1:35 Well, I think the RFC that we're talking about today, is going to require quite a bit more than seven lines of code? Jordan LeDoux 1:41 Quite a bit more. Yeah. Derick Rethans 1:42 So what is this RFC that we're talking about today? Jordan LeDoux 1:45 Well, user defined operator overloads is a way for PHP developers to define the ways in which objects interact with specific operators. So for instance, the plus operator, the plus sign. It's a way for those objects to kind of define their own logic as far as how that's handled, which right now, as of PHP 8.0, those were all switched to type errors. So it's not possible currently to write any code that doesn't result in a fatal error, where objects are used with operators. Derick Rethans 2:25 Usually, I ask about every RFC, what problem are you trying to solve this? So what problem are you trying to solve this RFC? Jordan LeDoux 2:31 The biggest problem that this solves is that objects contain, so objects in most programs represent a value or multiple values that have a program context. That's the most powerful thing about objects is they're contextual, and they understand the state, they understand what state the object is in, and sometimes even what state the whole program is in. And that's necessary for a lot of things. Like for instance, if you're tracking a distance, you know, you might measure that meters, and that would have a number you might have 30 meters of distance, but it also has a unit of meters. You could just represent that as an int. And then the program just knows internally, hey this is always in meters. But if you need to convert that to a different unit, then that becomes: Okay, well, now I need a special case some things, or I need a function just for converting, and I need to remember which unit my number is in. In a lot of cases, you handle that with objects because objects understand state, and they understand state transitions, which is what a lot of methods are about; transitioning the state of the object from one state to another. Operators are also about state transitions. And they're about very specific kinds of state transitions. It's natural in a lot of ways to think that you, you should be able to define how those two things interact. But currently, it's just not possible within PHP. Derick Rethans 4:00 Well, does them this magic operator overloading? Jordan LeDoux 4:04 It allows PHP developers to define an implementation logic, which is much like you define a function body that describes how does this object interact with this operator. That's essentially it. There's a lot of other details as to how it does that and what are the restrictions, but that's really the core of the idea. Derick Rethans 4:26 And in what kind of situations would you use that? Jordan LeDoux 4:28 A lot of them are situations where you're doing very complicated mathematics, or scientific computing or machine learning or things of that nature, where you are going to routinely encounter numbers that have state to them or that have multiple dimensions to them. So for instance, vector mathematics is one where the way that vectors interact with a lot of the operators that we're familiar with, like the multiplication sign is very different than how the number five interacts with the multiplication sign. Complex numbers is another one, you know, to multiply two complex numbers together, you have to treat it like a polynomial where you're multiplying it with the FOIL method: first, outside, inside, last. You know, there's a lot of those sorts of circumstances. But it also could potentially be very useful for some things that are not really mathematical but more quality of life for PHP developers. For instance, scalar objects is something that a lot of developers in PHP have, you know, wanted for a while. It's a thing that's a little more difficult to pin down, how exactly would you go about doing this within the engine, and it's a thing that the engine would kind of have to be very opinionated about by its nature. PHP developers can't provide their own scalar objects. And the main reason for this is that scalars interact with operators and objects can't. So simply allowing PHP developers to define a way for objects to interact with operators would allow user land to develop their own scalar object replacements. It wouldn't make every scalar that object; scalar objects within the engine still has, it's a separate feature. And it's still a thing that would be desirable, probably to a lot of people. But it gets quite a bit of the way there. Derick Rethans 6:20 It is always interesting that people come up with the example of complex numbers, because I'm not sure how useful that is in a PHP user land context. And then beyond the scalars, I then sometimes struggle to see where this could be used. With the only exception is probably doing calculations with money related issues. The moment you bring up operator overloading, you'll also get people to say that this is going to get abused. Examples of that, in my opinion at least, is where in C++ you have like the << operator to put things into the stream and stuff like that. What answer would you have to kind of comments? Jordan LeDoux 6:58 Abuse of operator overloads to do things that can create unmaintainable code, because that's really the concern for developers is, does a language feature promote code that's difficult to maintain, that's difficult to understand, that's difficult to follow, and develop, and you know, work with. The RFC, the way that I've gone about this implementation, has had that in mind, because I also have experienced that. This is not a thing where I coming down from the academic high tower with, you k...

PHP Internals News: Episode 95: PHP 8.1 Celebrations Thursday, November 25th 2021, 09:23 GMT London, UK In this episode of "PHP Internals News" we're looking back at all the RFCs that we discussed on this podcast for PHP 8.1. In their own words, the RFC authors explain what these features are, with your host interjecting his own comments on the state of affairs. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. Derick Rethans 0:23 This is episode 95. I've been absent on the podcast for the last few months due to other commitments. It takes approximately four hours to make each episode. And I can now unfortunately not really justify spending the time to work on it. I have yet to decide whether I will continue with it next year to bring you all the exciting development news for PHP 8.2. Derick Rethans 0:44 However, back to today, PHP eight one is going to be released today, November 25. In this episode, I'll look back at the previous episodes this year to highlight a new features that are being introduced in PHP 8.1. I am not revisiting the proposals that did not end up making it into PHP 8.1 feature two features I will let my original interview speak. I think you will hear Nikita Popov a lot as he's been so prolific, proposing and implementing many of the features of this new release. However, in the first episode of the year, I spoke with Larry about enumerations, which he was proposing together with Ilija Tovilo. I asked him what enumerations are. Larry Garfield 1:26 Enumerations, or enums, are a feature of a lot of programming languages. What they look like varies a lot depending on the language, but the basic concept is creating a type that has a fixed finite set of possible values. The classic example is booleans. Boolean is a type that has two and only two possible values true and false. Enumerations are way to let you define your own types like that, to say this type has two values Sort Ascending or Sort Descending. This type has four values for the four different card suits, and a standard card deck. Or a user can be in one of four states pending, approved, cancelled or active. And so those are the four possible values that this variable type can have. What that looks like varies widely depending on the language. In a language like C or C++, it's just a thin layer on top of integer constants, which means they get compiled away to introduce at compile time, and they don't actually do all that much they're a little bit to help for reading. On the other end of the spectrum, you have languages like rust or Swift, where enumerations are a robust, advanced data type and data construct of their own. That also supports algebraic data types. We'll get into that a bit more later. And is a core part of how a lot of the system actually works in practice, and a lot of other languages are somewhere in the middle. Our goal with this RFC is to give PHP more towards the advanced end of enumerations. Because there are perfectly good use cases for it, so let's not cheap out on it. Derick Rethans 3:14 In the next episode, I spoke with Aaron Piotrowski about another big new feature: fibres. Aaron Piotrowski 3:20 A few other languages already have Fibers like Ruby. And they're sort of similar to threads in that they contain a separate call stack and a separate memory stack. But they differ from threads in that they exist only within a single process and that they have to be switched to cooperatively by that process rather than pre-emptively by the OS like threads. And so the main motivation behind wanting to add this feature is to make asynchronous programming in PHP much easier and eliminate the distinction that usually exists between async code that has these promises and synchronous code that we're all used to. Derick Rethans 4:03 I also asked Aaron about small PHP I actually have a slightly related question that pops into my head as like. There's also something called Swoole PHP, which does something similar but from what I understand actually allows things to run in threats. How would you compare these two frameworks or approaches is probably the better word? Aaron Piotrowski 4:25 Swoole is they try and be the Swiss Army Knife in a lot of ways where they provide tools to do just about everything. And they provide a lot of opinionated API's for things that in this case, I'm trying to provide just the lowest level just the only the very necessary tools that would be required in core to implement Fibers. Derick Rethans 4:48 Although I discussed several deprecations from Nikita and the last year, I only want to focus on the new features. In episode 76. I spoke with him about array unpacking, after talking about changes to Null in internal functions. Nikita Popov 5:01 The old background is set we have unpacking calls. If you have the arguments for the call in an array, then you write the free dots and the array is unpacked intellectual arguments. Now what this RFC is about is to do same change for array unpacking, so allow you to also use string keys. Derick Rethans 5:24 In another episode, I spoke with David Gebler on a more specific addition of a new function fsync. David explains the reason why he wants to add this to PHP. David Gebler 5:34 It's an interesting question, I suppose in one sense, I've always felt that the absence of fsync and some interface to fsync is provided by most other high level languages has always been something of an oversight in PHP. But the other reason was that it was an exercise for me in familiarizing myself with PHP core getting to learn the source code. And it's a very small contribution, but it's one that I feel is potentially useful. And it was easy for me to do as a learning exercise. Derick Rethans 5:58 And that is how things are added to PHP sometimes, to learn something new and add something useful at the same time. After discussing the move of the PHP documentation to GIT an episode 78, in Episode 79, I spoke with Nikita about his new in initializers RFC. He says: Nikita Popov 6:15 So my addition is a very small one, actually, my own will, I'm only allowing a single new thing and that's using new. So you can use new whatever as a parameter default, property default, and so on. Derick Rethans 6:29 The addition of this change also makes it possible to use nested attributes. Nikita explains: Nikita Popov 6:34 I have to be honest, I didn't think about attributes at all, when writing this proposal. What I had in mind is mainly parameter defaults and property defaults. But yeah, attribute arguments also use the same mechanism and are under the same limitations. So now you can use new as an attribute argument. And this can be used to effectively nest attributes. Derick Rethans 6:59 Static Analysis tools are used more and more with PHP, and I spoke to the authors of the two main tools, Matt Brown, of Psalm, and Ondrej Mirtes of PHPStan. They propose to get her to add a new return type called noreturn. I asked him what it does and what it is used for. Ondrej Mirtes 7:14 Right now the PHP community most likely waits for someone to implement generics and intersection types, which are also widely adopted in PHP docs. But there's also noreturn, a little bit more subtle concept that would also benefit from being in the language. It marks functions and methods that ...

PHP Internals News: Episode 94: Unwrap Reference After Foreach Thursday, August 26th 2021, 09:22 BST London, UK In this episode of "PHP Internals News" I chat with Nikita Popov (Twitter, GitHub, Website) about the "First Class Callable Syntax" RFC. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick. Welcome to PHP internals news, the podcast dedicated to explaining the latest developments in the PHP language. This is Episode 94. Today I'm talking with Nikita Popov about the unwrap reference after foreach RFC that he's proposing. Nikita, would you please introduce yourself? Nikita Popov 0:33 Hi, Derick. I'm Nikita and I work at JetBrains on PHP core development. Derick Rethans 0:38 So no changes compared to the last time. Nikita Popov 0:41 Yes, at the time before that. Derick Rethans 0:43 So what is the problem that is RFC is going to solve? Nikita Popov 0:46 Well, it's really a very minor thing. I think it's a relatively well known problem for the more experienced PHP programmers. It's like a classic example, you have a foreach loop by reference. So foreach array as value by reference, and then you do a second loop after that, foreach array as value at the same it's by value. So without the reference sign. The result of that is that your last two array elements are going to be the same, which is kind of unexpected. If you're not familiar with how references in PHP work and scoping in PHP works. So I think it's worth explaining what's going on there. Derick Rethans 1:27 Can you quickly explain the scoping or rather the lack of it, I suppose? Nikita Popov 1:31 Yeah, it's really the lack of PHP really only has function scoping. So if you have a foreach array as value, then the value variable is going to stay alive, even after the foreach loop. And usually, that won't make much of a difference. So you will just have like reference to the last element of the array, might even be useful for some cases, you know, before we added the array, I think, array_key_last function. If the last element now is a reference, so if you have a reference to the last element, then you're write into that variable is also going to modify the last element of the array. So if you now have a second foreach loop, using the same variable, that's actually not just modifying that variable, but it's also always modifying the last element of the array. Derick Rethans 2:15 Okay, just to clarify, it isn't necessarily the last element in the foreach loop. It's the last one that's been assigned to? Nikita Popov 2:22 Yeah, that's, that's true. Derick Rethans 2:24 Is this not something that people actually use for some useful reasons? Nikita Popov 2:28 As mentioned before, technically, you could use it to get a reference to the last element and then modify the last element outside the foreach loop. I don't think this is a particularly common use case. But I'm sure people have used in here there. This is a use case we would break with the proposed RFC. Derick Rethans 2:47 I think it is one I have used in the past, it's probably not how I would do it now. But I'm pretty sure I have some point in the past. What are you proposing to change with this RFC? Nikita Popov 2:57 The change is pretty simple. And that's to unwrap or to break the reference after the loop. You will still have like after the loop, the variable will still contain the value of the last element, or of the last like visited element, but it will no longer be a reference to it. If you write into the variable, it will not modify the original array. And if you have a second loop that writes into the variable that also doesn't modify the original error any more. Derick Rethans 3:25 At which point and how is this reference broken? Nikita Popov 3:29 It's at the end of the foreach loop, or as you say, if you break out too early, then of course, it would also get broken. So it's referenced inside the foreach loop and stops being referenced outside the loop. Derick Rethans 3:41 And that would happen also, if I would use a goto for example? Nikita Popov 3:45 Oh, that that's a trick question, actually, yes, it should happen. But now that you have mentioned it, I think my current implementation does not handle that particular case, I will have to double check it. But that should happen, yes. Derick Rethans 4:00 It's good to know that you've thought about it then. Nikita Popov 4:02 Well, I didn't think about it. Because I mean, I guess I can mention it here, the way this works is that well, at the end of the foreach loop, we have like an instruction that frees the loop variable. And I can just add an additional one that breaks reference. But if you use things like goto or multi level breaks, or something like that, then we insert these clean-up instructions before the jump. We have to make sure to actually insert the reference breaking instruction there as well. So it's like not automatically handled. Derick Rethans 4:38 Is this going to be a separate instruction or as we tend to call them opcodes? Nikita Popov 4:43 I'm using a separate one, but one could run it as a flag into the instruction that frees the loop variable, but I think it's cleaner to have a separate instruction for it. Like technically one could optimize it away in some cases, like I wouldn't bother but it's like semantically a different thing. Derick Rethans 5:01 I think it'd be nicer result, because it makes it easier to visualize what's happening, right? Nikita Popov 5:06 Yeah, it is. Derick Rethans 5:07 Did you actually check whether some code uses this construct? Nikita Popov 5:10 I have to admit, I tried checking it using a very basic approach, just look at foreach loops by reference. And then if the variable is used after that. But that kind of primitive approach has way too many false positives, for example, you have a foreach loop inside, and if, and then the variable is reused inside an else. So it like wouldn't flow from the if into the else. So you would have to do some kind of more sophisticated control flow analysis. It's something that can be done, but I didn't bother doing it for a one off backwards compatibility check. So I don't have any hard data on how much code is actually using something like this. Derick Rethans 5:51 So this is where I'm a little bit on the fence about this change, because it is changing behaviour, that's going to be pretty hard to figure out what is actually going to affect your codebase. Nikita Popov 6:01 ...