Friday, April 24, 2015

xNormal 3.19.0

- The UI frontend is now a little bigger
- Added 3dsmax / Maya 2016 support.
- Now you can invoke the obj2sbm and object-tangent space tools from the command line.
- The installer now uses modern UI.
- Fixed a lot of bugs and improved compatibility with Quadros.
- Recompiled using VS2013CEu4 and the latest libraries, including FBX 2016, Cuda 7/Optix 3.7

Thursday, January 08, 2015

ratGPU 0.8.0

- Added an option in the standalone renderer app to re-install the Max/Maya plug-ins.
- Improved the speed by about 2% in some cases.
- Removed the dependency on TBB.
- Adjusted the bias parameter a bit more.
- Added the license to the Help/About dialog.
- Tweaked the random noise generator a bit.
- Fixed a problem when removing the app from the Windows installed programs list.
- Updated the Windows installer's UI a bit.
- Removed the installer for MacOSX. Now everything is inside the .app.
- Recompiled using VS2013 CE Update 4 and XCode 6.
- Recompiled using the latest libraries ( Qt 5.4.0, NSIS 3.0b1 )

Monday, October 06, 2014

xN4 delayed to 2015

Sorry, but we're delaying xNormal 4 to 2015-H2 because:

1. We were expecting solid OpenCL 2.0 drivers some months ago, but it didn't happen. We're going with plan B while they mature: CUDA 6.5.

2. We need hardware tessellation support on Macs ( OpenGL 4.2 ). Currently, only 3.3 is supported ... Also, clang has some problems with OpenMP atm.
We also want to use OpenGL 5.0 ( aka glNext aka Vulkan ) and DX12.

3. Some 3rd-party libs still have problems compiling with C++11/14. We have submitted some bug reports and we hope they can fix them soon.

4. We need at least Qt 5.4 on Ubuntu 14.04LTS, installed from the standard repository. Currently only 5.2.1 is supported.

5. We want to add ARM support too.

6. xNormal 4 will use a completely new core based on ratGPU 2.0.

7. We're addicted to WoW ( new expansion coming out! ) and Interstellar's trailers. Seriously.

Monday, August 18, 2014

xNormal 3.18.9 and ratGPU 0.7.3

For xNormal, fixed a bug when position-offsetting the cages.
Also, recompiled with the latest JRE and Optix 3.6.2, which should support Maxwell cards.

For ratGPU, adjusted the bias parameter a bit, so the self-occlusion problems should be fixed on Mac now. Also recompiled with the latest Qt, VS and TBB libs.

Now I'm 100% focused on xN4 !

Tuesday, June 03, 2014

xNormal 3.18.8 and ratGPU 0.7.2

Added 3dsmax/Maya 2015 support and fixed some bugs for xN 3.18.8.

ratGPU 0.7.2 is now compiled with the latest libraries and compilers ( gcc 4.8.2 / VS2013-u2 ) for maximum performance. Also, now it renders two tiles per OpenCL device in order to maximize the speed ( that's the "HT" signature on the device list ... aka GPU 'Hyperthreading' :D )

Thursday, January 02, 2014

xNormal 3.18.6

xNormal 3.18.6 released:

- Now the result from the base texture bake is linearly filtered, resulting in better quality.

- Modified the dilation filter's algorithm to consume much less memory. Also, modified the internal image structs to be more SIMD-friendly.

- Increased the maximum render size to 32k x 32k.

- The 3dsmax SBM exporter now saves data from current frame instead of using the first frame.

- Now you can render a new map : the translucency map, which can be used to simulate semitransparent objects and SSS.

- Recompiled using the latest libraries ( FBXSDK 2014.2, JRE 1.7u45, libpng-1.6.7, lua 5.2.3, OpenEXR 2.1.0 ).

Thursday, October 10, 2013

ratGPU 0.6.0 and xN 3.18.4 released

For ratGPU 0.6.0, I've optimized it a bit more for Radeon cards and recompiled it using the latest libraries. Radeon 7990 is the new king.

I've also released xNormal 3.18.4 which corrects some bugs.

Sunday, September 01, 2013

Some things I will never understand

1. Apple, OpenCL's founder, does not support OpenCL on the iPad ...
It would not only help to compute faster but would also be more power-efficient...

2. Have you noticed that the current OpenGL version is 4.5? Then... why does Apple's OSX barely support 3.2? Also, they set the max vertex count to 150k and the index count to only 1M. It's almost impossible to render dense meshes efficiently with those limits!

Also, I will never understand why the OpenGL group decided to use source-code shaders. It's much better to use precompiled ones, as Direct3D or OpenCL's SPIR do, separating compiling from linking and allowing virtual functions/interfaces so the user can inject closed-source functions there.

3. Why is the C++ ABI not standardized yet? If you write a plug-in or component system, it's a fucking nightmare to make it compatible across different compilers. Seriously, is it so difficult to get the major ISVs together ( Oracle, IBM, Microsoft, GNU, Apple LLVM, Intel, etc... ) and decide on a standard C++ ABI based on Itanium64 or LLVM ?

4. Why hasn't Android been widely adopted on desktop computers yet?
There is a small project called , but it lacks lots of hardware support. I simply cannot understand why Google does not push that !

5. We like the start button and XP's style! Is that so hard to understand, Microsoft? A desktop PC is NOT a touch screen; it's not a mobile phone, nor a tablet. We use mouse and keyboard ! In fact, the first two things people do after installing Windows 8 are:

1. Install any of the available start button hacks ( Pokki, IObit, etc... )

2. Remove all the damn metro apps with these PowerShell commands:

Get-AppxPackage -AllUsers | Remove-AppxPackage
Get-AppxProvisionedPackage -online | Remove-AppxProvisionedPackage -online

Monday, July 22, 2013

OpenCL 2.0 spec released !

Khronos just released the OpenCL 2.0 / SPIR 1.2 (provisional) spec !

  • Shared Virtual Memory
    Host and device kernels can directly share complex, pointer-containing data structures such as trees and linked lists, providing significant programming flexibility and eliminating costly data transfers between host and devices.
  • Dynamic Parallelism
    Device kernels can enqueue kernels to the same device with no host interaction, enabling flexible work scheduling paradigms and avoiding the need to transfer execution control and data between the device and host, often significantly offloading host processor bottlenecks.
  • Generic Address Space
    Functions can be written without specifying a named address space for arguments, especially useful for those arguments that are declared to be a pointer to a type, eliminating the need for multiple functions to be written for each named address space used in an application.
  • Images
    Improved image support including sRGB images and 3D image writes, the ability for kernels to read from and write to the same image, and the creation of OpenCL images from a mip-mapped or a multi-sampled OpenGL texture for improved OpenGL interop.
  • C11 Atomics
    A subset of C11 atomics and synchronization operations to enable assignments in one work-item to be visible to other work-items in a work-group, across work-groups executing on a device or for sharing data between the OpenCL device and host.
  • Pipes
    Pipes are memory objects that store data organized as a FIFO and OpenCL 2.0 provides built-in functions for kernels to read from or write to a pipe, providing straightforward programming of pipe data structures that can be highly optimized by OpenCL implementers.
  • Android Installable Client Driver Extension
    Enables OpenCL implementations to be discovered and loaded as a shared object on Android systems. 

I like it.. more or less :D

  • The shared virtual memory will be very useful for dealing with super-large resources. It will also save lots of memory on APUs/SoCs because the data won't need to be replicated.
  • Dynamic parallelism is also good because kernels can be fired without host CPU intervention.
  • Images can now be written and read at the same time and also created from a 2D buffer.
  • Atomics ( especially floating point ones ) are always welcome !
  • Pipes can be interesting. I like the stream approach, it may be useful.
  • The Android ICD is also very welcome, but I highly doubt Google would permit that anyways because they're Renderscript-ninjas and haters :p
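
To show why floating point atomics matter: when only compare-and-swap is available, a float add has to be built from a retry loop. The sketch below is a host-side C++ analogy using std::atomic, not OpenCL kernel code, and atomic_add_float is a hypothetical helper name chosen for the example.

```cpp
#include <atomic>

// Atomically adds `value` to `target` and returns the previous value.
// Built from compare-and-swap, the usual fallback when no native
// floating point fetch-add is available.
inline float atomic_add_float(std::atomic<float>& target, float value) {
    float expected = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak stores the current value into
    // `expected`, so each retry recomputes the sum from fresh data.
    while (!target.compare_exchange_weak(expected, expected + value,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
    }
    return expected;
}
```

Under heavy contention this loop can spin many times per add, which is why a native floating point atomic is so welcome.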

But I think some important things are missing for my taste:
  • A flag to indicate the task could take a lot of time to complete, so the implementation could disable the f$%@ing Windows watchdog.
  • Multi-sized image arrays, so an image array could contain several images of different sizes. 
  • C++ template and simple virtual/abstract methods support.
  • Compressed textures support.

I think SPIR is also very critical. The IHVs should adopt it as soon as possible because:

  1.  Most enterprises aren't using OpenCL because they don't want to distribute their kernels' source code with the app.
  2.  On-the-fly compilation of kernel source can take a lot of memory and time. It's much better to pre-compile the kernels offline, as DirectX and CUDA do.

And, yes, of course, xN4 and ratGPUv2 are gonna take advantage of this ... very soon :D