Also, I didn't see what you mentioned in the given link. Can you please quote the part about stdcall and fastcall from it?
I think you already understood the difference between __stdcall and __fastcall: two different calling conventions.
64-bit architectures use __fastcall. Since you're not hacking into the header files of the SDK, it won't
be a problem. Yes, calling conventions are not specific to the computer hardware architecture; Linux on 32-bit
also uses a __fastcall-style convention. So you're correct: yes, it's specific to the platform.
What I was talking about is the "source code" compatibility between older 32-bit source bases and 64-bit source bases.
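That source-compatibility trick can be sketched like this (MYAPI is a made-up macro name for illustration; the real Windows headers use macros such as WINAPI, and on x64 MSVC simply ignores __stdcall/__fastcall annotations because there is only one native convention):

```c
#include <assert.h>

/* Minimal sketch of how a header keeps one source base building on both
   targets. MYAPI is a hypothetical macro, not a real SDK name. */
#if defined(_WIN64)
  #define MYAPI                 /* x64: single convention, annotation dropped */
#elif defined(_WIN32)
  #define MYAPI __stdcall       /* x86: explicit callee-cleans-stack convention */
#else
  #define MYAPI                 /* non-Windows: plain C convention */
#endif

int MYAPI add(int a, int b) { return a + b; }
```

Callers never spell the convention out, so the same source line compiles unchanged on 32-bit and 64-bit targets.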
In a 32-bit computer, the width (or size) of the data bus is 32 bits.
No, saying "the data bus is 32 bits" doesn't really mean anything, because CISC computing allows you to read
data from memory that is not aligned to 32-bit boundaries. In modern computing, "data bus" refers to the
micro-architecture data bus, and with technologies like hyper-threading, deep data pipelines and caches,
even inside the micro-architecture you can't clearly call it 32-bit. You can argue the registers are 32 bits; yes, that's
a good argument. And if you talk about the physical memory data path, it's clearly wider than 32 bits.
DDR (Double Data Rate) technologies exceeded that 32-bit limit a long time ago, even in Pentium III
times. (I can't recall precisely when DDR3 started to dominate the market.)
The real difference is on the address bus, and it's virtual rather than physical. In programming we refer to pointers as 32-bit when we program on 32-bit machines, and pointers are 64-bit when we program on 64-bit machines. But even though addresses are 32-bit or 64-bit, that does not mean the physical address bus has to be 32 or 64 bits. In 32-bit computing there is something called PAE (Physical Address Extension); normally the memory management routines hide everything from us, and even in kernel space we can nicely
work with 32-bit raw pointers. So yes, that means the address bus can be wider than 32 bits even on 32-bit machines. Theoretically, with PAE the 4 GB limit extends to 64 GB.
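To make that concrete, here is a tiny check of my own: what the compiler gives you is the virtual pointer width of the build target, which says nothing about the physical address bus underneath.

```c
#include <assert.h>

/* Returns the pointer width, in bits, of the current build. This reflects
   the virtual address size the compiler targets, not the physical address
   bus: a PAE-enabled 32-bit kernel still reports 32 here even though the
   hardware can address more than 4 GB physically. */
unsigned pointer_bits(void) {
    return (unsigned)(sizeof(void *) * 8);
}
```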
You're right about that 'long long' / 'long double' thing. And I didn't copy it from anywhere, but since you asked for links,
I just googled, and here is the first link:
http://en.wikipedia.org/wiki/Long_double
Yes, you're right: it was not covered under ANSI C, so writing something like that is discouraged a thousand times over.
And it says that it clearly violates the IEEE 754 standard. However, Microsoft violated that standard even in the 32-bit SDKs/compilers. Microsoft engineers tell us:

"Do favor floating point calculations instead of memory reads"

which is also one of the main design decisions of both the OpenGL and DirectX implementations.
That means floating-point calculations are less costly than cache misses!
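As an illustration of what "favor calculations over memory reads" means in code (my own toy example, not from any SDK), compare a table lookup against recomputation; a few FP instructions cost cycles, while a cache miss on a large table can cost hundreds:

```c
#include <assert.h>

/* Option A: fetch a precomputed value (memory traffic, possible cache miss). */
double from_table(const double *table, int i) { return table[i]; }

/* Option B: recompute the value on the fly (pure FP work, no memory read). */
double recompute(int i) { return (double)i * (double)i * 0.001; }
```

Which option wins depends on whether the table fits in cache; on modern hardware, recomputing often wins once it doesn't.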
I know, and you haven't read my post carefully. Notice the following statement.
I did read your statement. What I tried to say is that 64-bit double precision has been there all along, because
an FPU is built into every 486DX and later microprocessor. Your point may apply to those extended IEEE 754 formats, but not
to current source bases.
Also, do you know that floating point costs the CPU far more than integer calculations?
Yes, but that's not the whole story. As I already said, favor calculations over lots of memory reads. Your claim was true on early implementations; today you need to weigh it against memory traffic, as I said above.
If not, why have optimization tools like Intel Parallel Studio implemented this as an optimization policy:
"reduce memory traffic over floating-point macro optimizations", where the macro optimization is not
suitable at the source level because it adds more complexity to the source code? Don't ask me how the Studio
does this; I simply don't have enough brain to understand it. :{
What if a user uses AMD64 asm code within C code via the 'asm' keyword? On Win32, it will crash the system. So in my opinion, all the slight differences that are specific to each OS/hardware, including pointers, should be taken into consideration.
Are you asking how we should implement it? Well, use preprocessor macros.
A 32-bit inline assembly segment won't be emitted into the object code if the build detects the wrong target. Anyway, using a shellcode
method you can still write it as "machine code" rather than assembly code. But the fact is I wasn't talking here
about assembly code: inline assembly is not part of ANSI C, so there is no point in talking about inline assembly.
I'm talking about an ANSI C compatible source base.
My intention was not to disprove everything you said but to express the right information (based on facts; I'm not going to lean on my 20+ years of C/C++ programming experience) with logical arguments. If you are taking information from other sites, please don't forget to mention the source. Otherwise it's unfair to those joining the discussion from here.
You are correct. I'm still a university student who failed at my job. But facts are facts.
Don't hesitate to quote my post and say "Hey, that point is wrong," because I want to learn.
But I didn't copy and paste here; there is no point in duplicating information across the net. And the facts that I mention here weren't
invented by me. By the way, unfairness is everywhere in the universe; however, none of the above are inventions of glemine.
If you want links, just google. I could just as well post "http://google.com/" in general [because Google keeps
a roughly 4-hour-old cached version of most sites].
Okay, anyway, you said that precision will be faster when we use 64-bit. Can you give me example
source code, quoted from an open-source code base or something of your own? I have reviewed the
fractal generator source base, and they are not even using long double. But a performance difference
does exist on 64-bit when we take proper benchmarks; maybe because, when we can access more than the 4 GB limit,
the memory manager pages less. What's your argument on that?
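For the benchmark part, this is the shape of the double-precision micro-benchmark I have in mind (my own toy loop, not from the fractal generator); build it once as 32-bit and once as 64-bit and compare the timings yourself:

```c
#include <assert.h>
#include <time.h>

/* A toy double-precision workload; volatile keeps the optimizer from
   deleting the loop. Returns the accumulated sum so the work is observable. */
double fp_workload(long n) {
    volatile double acc = 0.0;
    long i;
    for (i = 0; i < n; i++)
        acc += (double)i * 1.0000001;
    return acc;
}

/* Wall-clock the workload with clock(); compare e.g. an -m32 build
   against an -m64 build of the same source. */
double time_workload(long n) {
    clock_t t0 = clock();
    (void)fp_workload(n);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```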