One in ten AVMs out by 25%
By Kent Lardner for Lending Central
The Wall Street Journal did an analysis of the Zillow valuation model for 1,000 home sales in early 2007. It found that “the median difference between the Zillow estimate and the actual price was 7.8 percent.”
According to The Wall Street Journal test results, when it was wrong it was very wrong, off by 25 percent for one in 10 properties. This is certainly the case here in Australia too. You could be testing a model and find the first 9 properties return amazing results, all within a few percent of the sale price, then the next one could be 20% or more off target. It’s these few large errors that have such a significant impact on the forecast standard deviation (error estimate).
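To see how much one bad estimate moves the error spread, here is a minimal sketch. It assumes the FSD is measured as the standard deviation of percentage errors, which is one common definition and not necessarily the one each provider uses. The sale prices and estimates are made up for illustration.

```python
from statistics import median, pstdev

def pct_errors(estimates, sale_prices):
    """Signed percentage error of each estimate relative to its sale price."""
    return [(est - sale) / sale for est, sale in zip(estimates, sale_prices)]

# Hypothetical test batch: ten sales at $500,000 each.
sale_prices = [500_000] * 10
# Nine estimates within 3% of the sale price, and one that is 25% too high.
estimates = [505_000, 495_000, 510_000, 490_000, 500_000,
             515_000, 485_000, 502_500, 497_500, 625_000]

errors = pct_errors(estimates, sale_prices)

# The median absolute error barely notices the one bad estimate.
median_abs_error = median(abs(e) for e in errors)

# The standard deviation of errors (our stand-in for FSD) is dominated by it.
spread_all = pstdev(errors)
spread_without_outlier = pstdev(errors[:-1])

print(f"median absolute error: {median_abs_error:.1%}")
print(f"error spread, all ten: {spread_all:.1%}")
print(f"error spread, best nine: {spread_without_outlier:.1%}")
```

In this made-up batch the median error looks excellent, yet the single 25% miss makes the spread several times larger than it is for the other nine properties.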
A handful of base methodologies exist in calculating the automated estimate, with an unlimited number of proprietary algorithms and user interfaces used between firms. Each firm offering these services works hard to differentiate based on advancements in model design as well as the quantity and quality of data offered.
Example AVM base models - overview
Index: The simplest method, yet vastly superior to medians, is the repeat sales index. Google ‘repeat sales index’ or ‘Case-Shiller’ and you will find several white papers on how it works and how to build this model. All you need is a sales history, property type and address, all of which are readily available via government data sources.
At a basic level the index typically works fine for periods of up to 10 years. If you are confident the last sale or valuation amount was right, the Index method applies a market adjustment to that amount to calculate a current value. If you are not confident of the last sale price or valuation amount, don’t use it. Units and townhomes often work better than houses, as houses are often given expensive renovations which can artificially inflate the index.
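The market adjustment itself is simple arithmetic. The sketch below assumes you already have index values for the sale date and for today; building the repeat sales index is the hard part and is not shown. The function name, the index numbers and the 10 year cut-off parameter are illustrative only.

```python
def index_adjusted_value(last_price, index_at_sale, index_now,
                         years_since_sale, max_years=10):
    """Roll a trusted last sale price forward by the change in the index.

    Returns None when the sale is older than max_years, since the basic
    index method is typically only relied on for about a decade.
    """
    if index_at_sale <= 0:
        raise ValueError("index_at_sale must be positive")
    if years_since_sale > max_years:
        return None
    return last_price * (index_now / index_at_sale)

# Hypothetical example: sold for $400,000 when the index was 120,
# and the index for that market segment now reads 150.
value = index_adjusted_value(400_000, 120.0, 150.0, years_since_sale=4)
print(f"index-adjusted value: ${value:,.0f}")
```

Note that the result is only as good as the last sale price fed into it, which is why the method should not be used when that price is in doubt.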
Improvements to the index method can be achieved through filtering outliers, segmentation of the market and tracking changes between the repeat sales and removing properties that are known to have changed over time. Geo-coding has also assisted the repeat sales index in recent years.
One tip for lenders testing AVMs is to watch out for the use of the Index method. A few years back when testing an AVM provider for GE Mortgage Insurance we found a surprising number of AVM estimates within 1%. We found out that for any period up to 1 year, the AVM provider simply used the sale price. For the next 2 to 3 years they automatically applied the index method. That’s fine, as long as you know that’s what method is being used. We wanted to validate the sale price, so it did not suit our needs at the time.
Regression models: This has been covered in a number of previous articles for Lending Central, but at its simplest level we are creating a linear model based on land size, bed, bath and parking count as well as a location score to estimate the price. Each independent variable that enters the model has a coefficient value (e.g. $180 per sqm for land area) which is used against the subject property description to estimate the price. Smaller samples often cause problems, and relatively small and large values for any subject property variable can also produce errors.
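As a rough sketch of how the coefficients are applied to a subject property, consider the following. Only the $180 per sqm land figure comes from the example above; the intercept, the other coefficients and the sample ranges are invented for illustration and would normally come from fitting the model to local sales.

```python
# Hypothetical fitted model for one suburb (dollars per unit of each variable).
INTERCEPT = 50_000.0
COEFFICIENTS = {
    "land_sqm": 180.0,        # from the example in the text
    "beds": 45_000.0,         # assumed
    "baths": 30_000.0,        # assumed
    "parking": 15_000.0,      # assumed
    "location_score": 2_500.0 # assumed
}

# Hypothetical range of values seen in the sales sample used to fit the model.
SAMPLE_RANGES = {
    "land_sqm": (300, 1200),
    "beds": (1, 5),
    "baths": (1, 3),
    "parking": (0, 3),
    "location_score": (40, 95),
}

def regression_estimate(prop):
    """Intercept plus each coefficient times the subject property's value."""
    return INTERCEPT + sum(coef * prop[name] for name, coef in COEFFICIENTS.items())

def outside_sample_range(prop):
    """Variables where the subject is smaller or larger than anything in the sample."""
    return [name for name, (low, high) in SAMPLE_RANGES.items()
            if not low <= prop[name] <= high]

subject = {"land_sqm": 600, "beds": 3, "baths": 2, "parking": 1, "location_score": 70}
estimate = regression_estimate(subject)
warnings = outside_sample_range(subject)
print(f"estimate: ${estimate:,.0f}, out-of-range variables: {warnings}")
```

The range check is one simple way to flag the unusually small or large subject values that tend to produce errors in a linear model.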
Some providers have applied a number of advances to this basic model to optimize the results. Within the regression model software itself we can flag individual property sales that stand out (known as influential observations). These can be removed from the final sample used.
Other advances include fuzzy logic or artificial intelligence (AI). By applying a set of rules and weights, the system can test itself against an array of ‘if / then’ rules in an attempt to find the best model for a given property type or location. AI can also be used as a stand-alone model; however, many research papers conclude that it falls short of other methods and is too resource intensive.
Case Based Reasoning: This approach attempts to copy the approach used by valuers. By using a series of algorithms and a common method often referred to as ‘nearest neighbor’, properties can be automatically matched to a subject property. The matching criteria are typically the same set of criteria used with the regression model including distance and time.
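A bare-bones version of the matching step might look like the sketch below. The weights, field names and sales are all hypothetical, and a real system would use far more careful scaling. The score is built only from property attributes, distance and time; the sale price plays no part in choosing the comparables.

```python
# Hypothetical weights: how much one unit of difference counts against a match.
WEIGHTS = {
    "land_sqm": 0.01,
    "beds": 1.0,
    "baths": 1.0,
    "distance_km": 2.0,
    "months_since_sale": 0.25,
}

def dissimilarity(subject, sale):
    """Lower scores mean the sale is a closer match to the subject property."""
    score = 0.0
    for field in ("land_sqm", "beds", "baths"):
        score += WEIGHTS[field] * abs(subject[field] - sale[field])
    # Distance and time are already measured relative to the subject.
    score += WEIGHTS["distance_km"] * sale["distance_km"]
    score += WEIGHTS["months_since_sale"] * sale["months_since_sale"]
    return score

def nearest_comparables(subject, sales, k=3):
    """Return the k recent sales that most resemble the subject."""
    return sorted(sales, key=lambda sale: dissimilarity(subject, sale))[:k]

subject = {"land_sqm": 600, "beds": 3, "baths": 2}
sales = [
    {"id": "A", "land_sqm": 620, "beds": 3, "baths": 2, "distance_km": 0.3, "months_since_sale": 2},
    {"id": "B", "land_sqm": 900, "beds": 5, "baths": 3, "distance_km": 0.2, "months_since_sale": 1},
    {"id": "C", "land_sqm": 580, "beds": 3, "baths": 1, "distance_km": 1.0, "months_since_sale": 6},
    {"id": "D", "land_sqm": 610, "beds": 4, "baths": 2, "distance_km": 0.5, "months_since_sale": 3},
]

matched_ids = [sale["id"] for sale in nearest_comparables(subject, sales)]
print(matched_ids)
```

Here the much larger five bedroom house (B) is dropped even though it is the closest and most recent sale, because it differs too much from the subject.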
What is different to traditional valuations is that the sale price is usually not used in the comparable selection process, making it less prone to bias. This is one of the advantages of using AVMs as a checking tool against full valuations. When an AVM is used as a QA tool to process full valuations, lenders will find that a small percentage difference in error rates between models is much less critical.
The next step in the CBR method is often called the grid-adjustment process. For a more detailed outline Google ‘Generalizing the OLS and Grid Estimators’ and about a dozen articles will result. Here we adjust for any size difference between properties that we have shortlisted using our automated matching process.
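In its simplest form, the grid adjustment can be sketched as follows. Each comparable's sale price is adjusted for the differences between it and the subject, and the adjusted prices are then averaged. The per-unit adjustment rates and the three comparables are hypothetical; in practice the rates are often taken from a regression fitted to local sales.

```python
from statistics import mean

# Hypothetical dollar adjustment per unit of difference.
ADJUSTMENT_RATES = {"land_sqm": 180.0, "beds": 45_000.0}

def adjusted_price(subject, comp):
    """Comparable's price, adjusted to what it might have fetched
    had it been the same size as the subject property."""
    adjustment = sum(rate * (subject[field] - comp[field])
                     for field, rate in ADJUSTMENT_RATES.items())
    return comp["price"] + adjustment

def grid_estimate(subject, comps):
    """Simple average of the adjusted comparable prices."""
    return mean(adjusted_price(subject, comp) for comp in comps)

subject = {"land_sqm": 600, "beds": 3}
comps = [
    {"price": 560_000, "land_sqm": 620, "beds": 3},  # slightly more land
    {"price": 600_000, "land_sqm": 610, "beds": 4},  # extra bedroom
    {"price": 540_000, "land_sqm": 580, "beds": 3},  # slightly less land
]

adjusted = [adjusted_price(subject, comp) for comp in comps]
estimate = grid_estimate(subject, comps)
print(f"grid estimate: ${estimate:,.0f}")
```

An equal-weight average is used here for brevity; weighting the closest matches more heavily is a common refinement.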
One of the greatest advantages of the CBR method is how well it suits a user-assisted model as well as an AVM. In many articles on AVM theory or statistical modeling, the problem of ‘omitted-variable bias’ is often mentioned. In practical terms, this is typically anything other than location score, land size or house size that may add to or subtract from the price estimate and that ideally we should be using. What the computer can’t measure it can’t use, and omitted-variable bias is something that impacts all AVMs.
Reducing errors: Put simply, there is a residual error caused by the imperfections in data and limitations within any model design. This residual error in most cases, specifically here in Australia, is caused by significant differences in the quality between properties. The final set of comparable sales can often return similar sized properties in a given local area, all with very different street appeal and wide ranging values. One could be renovated with views and the others could all be knock-downs. By comparing the quality differences between properties, we can adjust this residual and instantly improve the results for any poor performing AVM estimates.
So let’s go back to our Wall Street Journal findings on errors. Applied to an AVM, we have the same problems here in Australia. Providers can spend millions each year in an attempt to reduce the error rate by just 1% or 2%. But unless the quality differences can be reliably measured for all properties nationally and successfully applied to a statistical model, we should still expect 1 in 10 returning values that are way off target.
The easiest step in reducing the FSD is to target the extreme error values, just as Six Sigma practitioners do. So if it’s a low FSD you need, that ‘1 in 10’ property needs to be managed. That is where human judgment comes in. To borrow a quote from the American Appraisal Institute:
“Those who are able to integrate the best elements of human judgment and computer technology will be the model of the appraisal profession in the next century, and users of such services who understand the techniques will possess sharper, more reliable decision-making abilities.”
Allowing a human to adjust for the quality differences between the comparables and even removing and replacing unsuitable comparables is the key to overcoming the limitations of these models. This can be a panel valuer or a trained staff member within your financial institution. To further manage the risk, the original value estimate of the AVM should be stored and compared to the adjusted result.
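One simple way to keep the original AVM figure alongside the human-adjusted result is sketched below. The record layout, the field names and the 10% review tolerance are assumptions for illustration, not a description of any particular lender's system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReviewedEstimate:
    """Keeps the untouched AVM value next to the reviewer's adjusted value."""
    avm_value: float
    adjusted_value: float
    reviewer: str

    @property
    def adjustment_pct(self):
        """How far the reviewer moved the estimate, as a fraction of the AVM value."""
        return (self.adjusted_value - self.avm_value) / self.avm_value

def needs_second_look(record, tolerance=0.10):
    """Flag adjustments larger than the (hypothetical) tolerance for further review."""
    return abs(record.adjustment_pct) > tolerance

# Hypothetical case: the reviewer lifts the value after noting a renovation and views.
record = ReviewedEstimate(avm_value=550_000, adjusted_value=632_500, reviewer="panel valuer")
flagged = needs_second_look(record)
print(f"adjustment: {record.adjustment_pct:+.1%}, flagged: {flagged}")
```

Because the record is frozen, the original AVM value cannot be overwritten once the adjustment has been saved, so the two figures can always be compared later.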
Summary: AVMs are increasingly being used at various stages in the loan process and applied more as a security risk tool. Up front they offer an instant estimate that can be used in a number of ways. This includes the automated selection of a full valuation or desktop valuation. Further on in the process they are also an excellent tool for auto-approving valuations or identifying a high risk security. In both applications, the error rate is less critical than when an AVM is being used as the primary valuation tool. To achieve reliability and accuracy for valuations, it remains hard to replace the human touch.
Next time you have an odd AVM result and would like to see how a user-assisted model can work, please email me: kent.lardner@pricefinder.com.au. Without obligation, I can demonstrate how a user-assisted model can work and return an estimate within minutes.