Friday, October 23, 2009

Mono 2.4 has been released!

Sometimes, when other boring things give me a rest and my mind falls on it (thanks to my friend Roger this time), I like to take a look at the Mono project.

I won't discuss the maturity or suitability of this platform in comparison to native Microsoft Windows based solutions (beware of Silverlight!!!), but there is something in the self-presentation you can read on Mono's site that leads me to meditation (or, being more dotnetian, to "reflection" ;)).

That is:
Mono 2.4 has been released! The Mono Project aims to make developers productive and happy: Mono 2.4 is our gift to the world. Sponsored by Novell, the Mono open source project has an active and enthusiastic contributing community and is positioned to become the leading choice for development of Linux applications.

Microsoft technologies are losing more and more ground every day in projects for public administrations and governments, ground that is gained by other supposedly Open technologies in the name of freedom, gratuity, etc. (Is Java from Sun really so Open? Which RDBMS are used in those open projects? Often Oracle. We are often facing mere electoral reasons.)

Moreover, Java is the preferred language taught in universities (Microsoft starts losing the war from the first battles, as the fresh working flesh arrives on the market without knowledge of dotnet).

So if Mono's aim of becoming the leading choice for the development of Linux applications were reached, they would achieve a big part of what Microsoft hasn't yet, that is, to spread dotnet and gain adepts across the world of IT solutions.

Saturday, March 07, 2009

IDisposable & "using" statement in C#

Are you using the using statement in your C# code?

What is it intended for?

Let's have a look at the official reference: Provides a convenient syntax that ensures the correct use of IDisposable objects.

And the sample code...

using (Font font1 = new Font("Arial", 10.0f))
{
      byte charset = font1.GdiCharSet;
}

And the "must read" Remarks: File and Font are examples of managed types that access unmanaged resources (in this case file handles and device contexts). There are many other kinds of unmanaged resources and class library types that encapsulate them. All such types must implement the IDisposable interface.

As a rule, when you use an IDisposable object, you should declare and instantiate it in a using statement. The using statement calls the Dispose method on the object in the correct way, and (when you use it as shown earlier) it also causes the object itself to go out of scope as soon as Dispose is called. Within the using block, the object is read-only and cannot be modified or reassigned.

The using statement ensures that Dispose is called even if an exception occurs while you are calling methods on the object. You can achieve the same result by putting the object inside a try block and then calling Dispose in a finally block; in fact, this is how the using statement is translated by the compiler (as you can read in the official reference).
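Applied to the earlier Font sample, the compiler translation mentioned above looks like this (a sketch of the documented expansion, not decompiled output):

```csharp
Font font1 = new Font("Arial", 10.0f);
try
{
    byte charset = font1.GdiCharSet;
}
finally
{
    // Dispose is called even if an exception was thrown inside the block.
    if (font1 != null)
    {
        ((IDisposable)font1).Dispose();
    }
}
```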

It also explains that you can declare and instantiate more than one object in the same using statement. And it reminds us that, although possible, it is bad practice to instantiate an object before the using statement in order to pass it in, as such an object would still exist after the using block's scope while its unmanaged resources would already be disposed, which in practice invalidates the object and creates an error-prone situation.
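For instance, the multiple-declaration form looks like this (the second font is a hypothetical example, just for illustration):

```csharp
using (Font font1 = new Font("Arial", 10.0f),
            font2 = new Font("Verdana", 12.0f))
{
    // Use font1 and font2 here; Dispose is called on both
    // (in reverse order of declaration) when the block ends.
}
```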

Let's assume that all of us believe that using the "using" statement is a good practice (not everybody thinks the same; you can find different opinions on this matter here), a rule, as stated by Microsoft's reference, and that it should therefore always be applied in our code (including those cases in which we see wizard-generated code with an empty "Dispose" method; perhaps in future versions of that code it won't be empty).

But the question is, how can a programmer be aware of disposable classes in order to code the "using" statement?

One alternative is knowing it on your own, based on your knowledge and mastery of the .NET Framework... not a very realistic approach, taking into account that you should extend it to any framework, class library or piece of code that gets into your hands...

You can also take advantage of Visual Studio's IntelliSense, or take a look at the class browser, to see whether a class you are about to use implements the "Dispose" method. Extra work. At least it can be a way of improving your mastery ;).

So, is there no systematic and reliable method to be advised when you forget to dispose of your objects??? Yes, there is, or we had better say, there was...

In VS2005's Code Analysis (aka FxCop) there is a rule explicitly intended to provide the desired help:

      CA2000 - DisposeObjectsBeforeLosingScope

... a rule gone with the wind and no longer present in VS2008, along with a few more, as you can read on Neno Loje's blog :(.

Read about the reasons on the Visual Studio Code Analysis Team Blog, and you'll see that this rule disappeared with the removal of one of the analysis engines (you'll also find there an availability matrix of the rules in Excel format for the different versions of VS and FxCop):

Analysis engine removed. In Visual Studio 2008 and FxCop 1.36 we removed one of our analysis engines. This engine was removed for a variety of reasons; it increased analysis time (although the engine encompassed less than 5% our analysis, it took up 50% of our time-to-analyze), indeterministic results (results appearing and disappearing between runs), and bugs found within the engine (and hence the rules that depended on it) required huge architectural changes. We instead decided to invest the resources that we would have spent on fixing the old engine, on a new data flow analysis engine based on Phoenix, which we will ship in a future version of Visual Studio.

Not very pleasing news... we'll have to wait... perhaps third-party products? Umh... :(.

Saturday, February 21, 2009

NUMBER vs NUMBER(p, s) (Oracle 11g)

Or how to choose between them for the numeric fields of your tables...
 
As I am sure you know, Oracle (11g RDBMS) offers the NUMBER datatype as the main choice to store numerical values in your tables (equivalent to the decimal/numeric types of SQL Server).

Although this is not the main purpose of this article, I assume that you know the difference between decimal-precision datatypes (such as NUMBER) and binary-precision datatypes, such as BINARY_FLOAT and BINARY_DOUBLE, also present in Oracle, and that you have decided that the real numbers of your application domain need decimal-precision storage. As you can read in Oracle's Reference, binary precision enables faster arithmetic calculations and "usually" (this is one of the key tips of this article) reduces storage requirements. But BINARY_FLOAT and BINARY_DOUBLE are approximate numeric datatypes: they store approximate representations of decimal values, rather than exact representations. For example, neither of them can exactly represent the value 0.1 (don't believe it? Try it, please), and perhaps that is not acceptable for the banking or tax application you are developing.
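A quick way to see this approximation at work, outside Oracle, is with any language whose default floats are binary (a minimal sketch in Python; here Decimal plays the role of an exact decimal type such as NUMBER):

```python
from decimal import Decimal

# 0.1 has no exact binary representation, so arithmetic drifts:
print(0.1 + 0.2)                  # 0.30000000000000004
assert 0.1 + 0.2 != 0.3

# A decimal type stores an exact decimal representation:
assert Decimal('0.1') + Decimal('0.2') == Decimal('0.3')

# What a binary double actually stores when you write "0.1":
print(Decimal(0.1))
```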

As a decimal type, NUMBER allows you to indicate the precision (total number of digits) and scale (number of digits to the right of the decimal point) when defining a field of this type (NUMBER(p, s)). 

Let's view some examples:

NUMBER(9, 2): Nine significant digits in total (precision), of which 2 (scale) may be used for the decimal part of the value (digits to the right of the decimal point). 
NUMBER(9): Nine significant digits in total, none of them for the decimal part. Yes, that's the way to restrict your fields to integer values.
NUMBER: "I will save whatever you give me", with an accuracy of up to 38 significant digits. 
NUMBER(*,2): You set no limit on the precision, but reduce (round) the decimal part to two digits. 
NUMBER(9,-2): Nine digits for the integer part, which will be "rounded" at the last two digits (interesting), i.e.: 987,654,321 -> 987,654,300.
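The rounding behaviour of the last two definitions can be mimicked with ordinary rounding (a sketch in Python; note that Python's round() rounds exact halves to even, while Oracle rounds halves away from zero, so they only differ on exact halves):

```python
# NUMBER(*, 2): unlimited precision, decimal part rounded to two digits
assert round(123.456789, 2) == 123.46

# NUMBER(9, -2): negative scale rounds the integer part at the last two digits
assert round(987654321, -2) == 987654300
```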

Most people would be happy with the lazy definition of NUMBER (without p or s). But this is not our case, and when defining the accuracy of a NUMBER we will consider at least two objectives:

a) To restrict the entry of data: If we specify precision and scale, we are adding a restriction that allows us to establish a greater shielding on the data (the more "downstream" the better, and the shield will apply to any application developed over this database).

Problem: It is vital to know the needs of the field precisely in advance, which is not always easy. For a field which is, for instance, intended to hold the surface of a construction in a cadastral application, precision and scale could be set without further problems (usually a two-digit scale for area values in square meters). 

But what precision and scale should be assigned to a coefficient K that can be fixed arbitrarily by a taxation law that shifts every year? Perhaps what today is a ratio with two decimal digits will tomorrow have six, forcing us to redefine the structure of the table every year, with the usual associated impact in a production environment.

b) Saving disk space: It is common thinking that if you reduce precision, storage needs will be reduced in the same measure, and therefore you will save disk space. Is this true?
 
According to Oracle's Reference:
Oracle Database stores numeric data in variable-length format. Each value is stored in scientific notation, with 1 byte used to store the exponent and up to 20 bytes to store the mantissa. The resulting value is limited to 38 digits of precision. Oracle Database does not store leading and trailing zeros. For example, the number 412 is stored in a format similar to 4.12 x 10², with 1 byte used to store the exponent(2) and 2 bytes used to store the three significant digits of the mantissa(4,1,2). Negative numbers include the sign in their length.
 
Taking this into account, the column size in bytes for a particular numeric data value NUMBER(p), where p is the precision of a given value, can be calculated using the following formula:
 
ROUND((length(p)+s)/2)+1

where s equals zero if the number is positive, and s equals 1 if the number is negative.
Zero and positive and negative infinity (only generated on import from Oracle Database, Version 5) are stored using unique representations. Zero and negative infinity each require 1 byte; positive infinity requires 2 bytes.

That is, the size in bytes (see the quoted passage above) is variable and depends on the value stored in each case!!!

Let's try it:
SQL> create table tbl1 (
as_number number(12)
);

Table created.

SQL> insert into tbl1 values(20000000);
SQL> insert into tbl1 values (12345678 );

SQL> select as_number, vsize(as_number) from tbl1;
....

AS_NUMBER  VSIZE(AS_NUMBER)
---------  ----------------
 20000000                 2
 12345678                 5
 
It is interesting to see that storing a value like 20000000 only requires 2 bytes, one for storing the mantissa (2) and another for the exponent of 10 (in this case 7). As said, the number of bytes used is dependent on the stored value.
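As a sanity check, the documented formula can be turned into a few lines of code (a sketch in Python, assuming non-negative integers; it mirrors the formula above, not Oracle's actual storage engine, so treat it as an approximation):

```python
import math

def number_column_bytes(n):
    """Approximate column size in bytes for a non-negative integer stored
    in NUMBER, per the formula ROUND((length(p)+s)/2)+1 with s = 0."""
    if n == 0:
        return 1                      # zero has a special 1-byte representation
    digits = str(n).rstrip('0')       # Oracle does not store trailing zeros
    return math.ceil(len(digits) / 2) + 1

print(number_column_bytes(20000000))  # 2, as VSIZE reported above
print(number_column_bytes(12345678))  # 5
```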

Therefore, the first conclusion to be obtained is that the consumed amount of bytes is a function of the stored values. And as a result of the above, you'd probably think that if you don't specify precision and/or scale, it will have no negative effects regarding disk occupation (nor on performance, especially if the field has an associated index)... uhm, really?

Be careful: this can be accepted as valid when the stored values are integers, but it is not always valid when they are real numbers. For instance, the result of a division between two fields holding real numbers, performed in an UPDATE statement and stored in another NUMBER field, could occupy the full available size (38 digits) if the division produces further decimal digits (unlikely to be needed). Of course, this can be avoided using a rounding function, but we don't want to rely on every current or future programmer who will evolve the system.
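A minimal illustration of that defensive rounding (hypothetical table and column names; the scale of 2 is just an example):

```sql
-- Without ROUND, a / b may expand to up to 38 digits of precision
-- and fill the whole capacity of the NUMBER column "ratio":
UPDATE rates SET ratio = a / b;

-- Rounding at the statement level bounds the decimal part explicitly:
UPDATE rates SET ratio = ROUND(a / b, 2);
```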

You can read about this in this article, where it is explained in a detailed manner. 

In this reading you will see that specifying the scale is highly recommended, because it allows you to refine the decimal part, the main source of potential space "leaks".

Even when the values to store are integers, it is good practice to define such fields as NUMBER(*, 0), if we do not want to put a limit on the precision, or as NUMBER(p, 0) if we want to limit the number of input digits, as we ensure that fields so defined will never accept anything but integers (and that they will use only the disk space needed to store the specific values inserted into them).

So, in order to avoid space leaks you should specify the scale (although, as said, if you can guarantee that your values will be integers, the problem won't exist)... provided, of course, that you have all the information needed to decide how to specify it...
 
General summary: 
Whenever it is necessary to shield what a field can admit in terms of maximum values and/or decimal digits, I recommend doing so in the definition of the field, indicating precision and scale: NUMBER(p, s).

Regarding disk storage savings, it is important to set the scale, especially for the results of calculations on real numbers, because otherwise you risk occupying the full capacity of NUMBER fields. However, if the data to be inserted is known to always be of integer type (a primary key based on a sequence (+1), for instance), we will not suffer such a negative impact.

Note: This article is a refactored English version of a previous post on Tracasa's wiki by the same poster.

Sunday, February 01, 2009

TI-IT::Official Google Blog: "This site may harm your computer" on every search result?!?!

I didn't notice it, but as you can see it did affect every web site on the Internet...

Here is the official explanation (human error):
Official Google Blog: "This site may harm your computer" on every search result?!?!

Saturday, January 31, 2009

Did they go mad at Google?

I was just searching with Google for references to the Microsoft Press "MCTS Self-Paced Training Kit (Exam 70-536)" book, and surprisingly for me all the results had the following advice beneath them: "Este sitio puede dañar tu equipo" (original text, as I was searching in Spanish), which is something like "This site might damage your computer" (using the keywords MCTS, self-paced, 536).

And I was so surprised because, among others, I could find in such a "damaging" site list Microsoft's msdn.microsoft.com, as well as www.amazon.com.

Moreover, when I clicked on any of them, I was redirected to another Google page where I was advised not to go to the searched page (or if I did, it would be under my own responsibility ;)).

My first reaction (I should be calmer, I know) was to change my default search engine from Google to "Live Search"... Next thought: the people responsible at those firms wouldn't be very happy to notice that their sites were not reachable through Google (as we all know, nobody uses it...).

Finally, two minutes later, I repeated the same search (same keywords) on Google, and then the unexplainable took place: the same results now appeared without the evil advice.

Can you understand that? Are Google's developments properly tested before being published on the Internet? How much business may those companies have lost in that lapse of time?

3D Modelling

When we talk about 3D cadastre, we should keep these principles in mind:
  • There is a need to register cadastral objects (cadastral units, properties) in three dimensions, because the traditional two dimensions are not sufficient.
  • A 3D cadastral object must therefore be considered as a volume, as opposed to a surface (the usual representation).
  • There will not always be a perfect match between the concept of a floor plan and the volume occupied by a cadastral object.
  • The volume occupied by a unit does not have to be a simple body, although it will normally be a grouping of simple regular bodies.
  • Within a volume there may be platforms that must be taken into account, since they imply usable surface area (could they also be considered virtual volumes when one side remains open? I don't like it).
  • Of general application: buildings, tunnels, caves, etc.